Posted by mips_avatar 12/3/2025

Everyone in Seattle hates AI (jonready.com)
967 points | 1065 comments | page 2
somekyle2 12/3/2025|
Anecdotally, lots of people in SF tech hate AI too. _Most_ people out of tech do. But enough of the people in tech have their future tied to AI that there are a lot of vocal boosters.
tptacek 12/3/2025||
It is not at all my experience working in local government (that is, in close contact with everybody else paying attention to local government) that non-tech people hate AI. It seems rather the opposite.
wk_end 12/3/2025|||
Managers everywhere love the idea of AI because it means they can replace expensive and inefficient human workers with cheap automation.

Among actual people (i.e. not managers) there seems to be a bit of a generation gap - my younger friends (Gen Z) are almost disturbingly enthusiastic about entrusting their every thought and action to ChatGPT; my older friends (young millennials and up) find it odious.

tptacek 12/3/2025|||
The median age of people working local politics is probably 55, and I've met more people (non-family, that is) over 70 doing this than in anything else, and all of them are (a) using AI for stuff and (b) psyched to see any new application of AI being put to use (for instance, a year or so ago, I used 4o to classify every minute spent in our village meetings according to broad subjects).

Or, drive through Worth and Bridgeview in IL, where all the middle eastern people in Chicago live, and notice all the AI billboards. Not billboards for AI, just, billboards obviously made with GenAI.

I think it's just not true that non-tech people are especially opposed to AI.

Capricorn2481 12/4/2025||
> The median age of people working local politics is probably 55, and I've met more people (non-family, that is) over 70 doing this than in anything else, and all of them are (a) using AI for stuff and (b) psyched to see any new application of AI being put to use

That seems more like a canary than anything. This is the demographic that doesn't even know which tech company they're talking to in Congress. That's not the demographic in touch with tech. They have gotten more excited about even dumber stuff.

For people under 50, it's a wildly common insult to say something seems AI generated. They are disillusioned with the content slop filling the internet, the fact that 50% of the internet is bots, and their future job prospects.

The only people I've seen liking AI art, like fake cat videos, are people over 50. Not that they don't matter, but they are not the driver of what's popular or sustainable.

sleepybrett 12/3/2025||||
Managers should realize that the thing AI might be best at is replacing them. Most of my managers don't understand the people they are managing and don't understand what those people are actually building. Their job is to get a question from management that their reports can answer, format that answer for their boss, and send the email. Their job is to be the leader in a meeting and make sure it stays on track, not to understand the content. AI can do all that shit without a problem.
vasvir 12/3/2025||
MANNA https://milweesci.weebly.com/uploads/1/3/2/4/13247648/mannap...
pseudalopex 12/4/2025|||
A Pew Research Center survey found an age correlation, but not a generation gap.[1]

[1] https://www.pewresearch.org/science/2025/09/17/ai-in-america...

exasperaited 12/4/2025||||
I live in a medium-sized British town of 100,000 people or so. It may be a slightly more creative town than most — lots of arts and music and a really surprisingly cool music scene — but I can tell you that AI pleases (almost) nobody.

I think a lot of it is actually the crass, unthinking, default-American-college-student manner in which ChatGPT speaks. It's so American and we can feel it. But AI-generated art and music are hugely unpopular, and AI chatbots replacing real customer service are something we loathe.

Generally speaking I would say that AI feels like something that is being done to us by a handful of powerful Americans we profoundly distrust (and for good reason: they are untrustworthy and we can see through their bullshit).

I can tell you that this is so different to the way the internet was initially received even by older people. But again, perhaps this is in part due to our changing perspectives on America. It felt like an exciting thing to be part of, and it helped in the media that the Web was the brainchild of a British person (even if twenty years later that same media would have liked to pretend he wasn't at a European research institution when he did it).

The feeling about AI is more like the feeling we have about what the internet eventually did to our culture: destroying our high streets. We know what is coming will not be good for what makes us us.

somekyle2 12/3/2025||||
I don't doubt that many love it. I'm just going based on SF non-tech people I know, who largely see it as the thing vaguely mentioned on every billboard and bus stop, the chatbot every tech company seems to be trying to wedge into every app, and the thing that makes misleading content on social media and enables cheating on school projects. But, sometimes it is good at summarizing videos and such. I probably have a biased sample of people who don't really try to make productive use of AI.
tptacek 12/3/2025||
I can imagine reasons why non-tech people in SF would hate all tech. I work in tech and living in the middle of that was a big part of why I was in such a hurry to get out of there.
pesus 12/3/2025||
Frankly, tech deserves its bad reputation in SF (and worldwide, really).

One look at the dystopian billboards bragging about trying to replace humans with AI should make any sane human angry at what tech has done. Or the rising rents due to an influx of people working on mostly useless AI startups, 90% of which won't be around in 5 years. Or even how poorly many in tech behave in public and how poorly they treat service workers. That's just the tip of the iceberg, and just in SF alone.

I say all this as someone living in SF and working in tech. As a whole, we've brought the hate upon ourselves, and we deserve it.

treis 12/3/2025|||
There's a long list of things that have "replaced" humans, going all the way back to the ox-drawn plow. It's not sane to be angry at any of those steps along the way. GenAI will likely not be any different.
GuinansEyebrows 12/3/2025|||
it's plenty sane to be angry when the benefits of those technical innovations are not distributed equally.
pesus 12/3/2025|||
It is absolutely sane to be angry at people's livelihoods being destroyed and most aspects of life being worsened just so a handful of multi-billionaires that already control society can become even richer.
Hammershaft 12/3/2025||
The plough also made the rich richer, but in the long run the productivity gains it enabled drove improvements to common living standards.
tptacek 12/3/2025|||
I don't agree with any of this. I just think it's aggravating to live in a company town.
majormajor 12/3/2025||||
Non-technical people that I know have rapidly embraced it as "better google where i don't have to do as much work to answer questions." This is in a non-work context so i don't know how much those people are using it to do their day job writing emails or whatever. A lot of these people are tech-using boomers - they already adjusted to Google/the internet, they don't know how it works, they just are like "oh, the internet got even better."

There's maybe a slow trend towards "that's not true, you should know better than to trust AI for that sort of question" in discussions when someone says something like "I asked AI how [xyz was done]" but it's definitely not enough yet to keep anyone from going to it as their first option for answering a question.

neutronicus 12/3/2025||||
Anyone involved in government procurement loves AI, irrespective of what it even is, for the simple fact that they get to pointedly ask every single tech vendor for evidence that they have "leveraged efficiency gains from AI" in the form of a lower bid.

At least, that's my wife's experience working on a contract with a state government at a big tech vendor.

tptacek 12/3/2025||
Not talking about government employees, for whatever that's worth.
kg 12/3/2025|||
EDIT: Removed part of my post that pissed people off for some reason. shrug

It makes a lot of sense that someone casually coming in to use chatgpt for 30 minutes a week doesn't have any reason to think more deeply about what using that tool 'means' or where it came from. Honestly, they shouldn't have to think about it.

tptacek 12/3/2025||
The claim I was responding to implied that non-techies distinctively hate AI. You're a techie.
tokioyoyo 12/3/2025|||
It’s one of those “people hate noticing AI-generated stuff, but everyone and their mom is using ChatGPT to make their work easier” situations. There are a lot of vocal boosters and vocal anti-boosters, but the general population just uses it in a Google fashion and moves on. Not everyone is thinking about the AI apocalypse every day.

Personally, I’m in between the opinions. I hate it when I’m consuming AI-generated stuff, but I can see the use for myself for work, or for asking a bunch of not-so-important questions to get a general idea of things.

IAmBroom 12/3/2025|||
Most of my FB contacts are not in tech. AI is overwhelmingly viewed as a negative by them. To be clearer: I'm counting anyone who posts AI-generated pictures on FB as implicitly being pro-AI; if we neglect this portion, the only non-negative posts about AI would be highly qualified "in some special cases it is useful" statements.
themafia 12/3/2025|||
> enough of the people in tech have their future tied to AI that there are a lot of vocal boosters

That's the presumption. There's no data on whether this is actually true or not. Most rational examinations show that it most likely isn't. The progress of the technology is simply too slow and no exponential growth is on the horizon.

Forgeties79 12/3/2025|||
What’s so striking to me is these “vocal boosters” almost preach like televangelists the moment the subject comes up. It’s very crypto-esque (not a hot take at all I know). I’m just tired of watching these people shout down folks asking legitimate questions pertaining to matters like health and safety.
lambchoppers 12/3/2025||
Health and safety seems irrelevant to me. I complain about cars; I point out "obscure" facts, like that they are a major cause of lung-related health problems for innocent bystanders; I don't actually ride in cars on any regular basis, and in fact I use them less than I use AI. There were people at the car's introduction who made all the points I would make today.

The world is not at all about fairness of benefits and impacts to all people; it is about a populist mass, what amuses them, and what makes their lives convenient, hopefully without their having to attend the relevant funerals themselves.

Forgeties79 12/3/2025||
> health and safety seems irrelevant to me

Honestly I don’t really know what to say to that, other than it seems rather relevant to me. I don’t really know what to elaborate on given we disagree on such a fundamental level.

lambchoppers 12/3/2025|||
Do you think the industry will stop because of your concern? If, for example, AI does what it says on the box but causes goiters for prompt jockeys, do you think the industry will stop then, or offshore the role of AI jockey?

It's lovely that you care about health, but I have no idea why you think you are relevant to a society that is very much willing to risk extinction to avoid the slightest upset or delay to progress as measured in consumer convenience.

Forgeties79 12/4/2025||
> Do you think the industry will stop because of your concern?

I’m not sure what this question is addressing. I didn’t say it needs to “stop” or the industry has to respond to me.

> It's lovely that you care about health,

1) you should care too, 2) drop the patronizing tone if you are actually serious about having a conversation.

lambchoppers 12/4/2025||
From my PoV you are trolling with virtue signalling and thought-terminating memes. You don't want to discuss why every(?) technological introduction so far has ignored priorities such as your sentiments, and any devil's advocate must be the devil.

The members of HN are actually a pretty strongly biased sample towards people who get the omelet when the eggs get broken.

Forgeties79 12/4/2025|||
> and any devil's advocate must be the devil.

No not the devil, but years ago I stopped finding it funny or useful when people "played" the part of devil's advocate because we all know that the vast majority of the time it's just a convenient way to be contrarian without ever being held accountable for the opinions espoused in the process. It also tends to distract people from the actual discussion at hand.

watwut 12/4/2025||||
People not being assholes and having opinions is not "trolling with virtue signaling". Even where people do virtue signal, it is a significant improvement over the "vice signaling" which you seem to be doing and expecting others to do.
adastra22 12/3/2025|||
I for one have no idea what you mean by health and safety with respect to AI. Do you have an OSHA concern?
Forgeties79 12/4/2025||
I have an “enabling suicidal ideation” concern for starters.

To be honest I’m kind of surprised I need to explain what this means so my guess is you’re just baiting/being opaque, but I’ll give you the benefit of the doubt and answer your question taken at face value: There have been plenty of high profile incidents in the news over the past year or two, as well as multiple behavioral health studies showing that we need to think critically about how these systems are deployed. If you are unable to find them I’ll locate them for you and link them, but I don’t want to get bogged down in “source wars.” So please look first (search “AI psychosis” to start) and then hit me up if you really can’t find anything.

I am not against the use of LLMs, but as with social media and other technologies before them, we need to actually think about the societal implications. We make this mistake time and time again.

pseudalopex 12/4/2025|||
> To be honest I’m kind of surprised I need to explain what this means so my guess is you’re just baiting/being opaque

Search for health and safety and see how many results are about work.

Forgeties79 12/4/2025||
You're being needlessly prescriptive with language here. I am talking about health and safety writ large. I don't appreciate the game you're playing, and it's why these discussions rarely go anywhere. It can't all be flippant retorts and needling words. I am clearly saying that we as a society need to be willing to discuss the possible issues with LLMs and make informed decisions about how we want this technology to exist in our lives.

If you don't care about that so be it - just say it out loud then. But I do not feel like getting bogged down in justifying why we should even discuss it as we circle what this is really about.

adastra22 12/4/2025|||
All the AI companies are taking those concerns seriously though. Every major chat service has guardrails in place that shut down sessions which appear to be violating such content restrictions.

If your concerns are things like AI psychosis, then I think it is fair to say that the tradeoffs are not yet clear enough to call this. There are benefits and bad consequences for every new technology. Some are a net positive on the balance, others are not. If we outlawed every new technology because someone, somewhere was hurt, nothing would ever be approved for general use.

Forgeties79 12/4/2025|||
> All the AI companies are taking those concerns seriously though.

I do not feel they are, but also I was primarily talking about the AI evangelists who shout down people asking these questions as Luddites.

adastra22 12/4/2025||
That's literally what the Luddites were doing though. It's a reasonable comparison.
Forgeties79 12/5/2025||
Luddite is usually used as an insult based on a misunderstanding of the Luddites. That’s the definition I’m responding to here.
adastra22 12/6/2025||
I would disagree. Luddite, to me, is a negative and pejorative label because history has shown Ned Ludd and his followers to have been a short-sighted, self-sabotaging reactionary movement.

I think the same thing of the precautionary movements today, including the AI skeptic position you are advocating for here. The comparison is valid, and it is negative and pejorative because history is on the side of advancing technology.

slashdave 12/4/2025|||
You mean "_Most_ people out of tech that write social media posts I read".
mips_avatar 12/3/2025|||
That’s fair. The bad behavior in the name of AI definitely isn’t limited to Seattle. I think the difference in SF is that there are people doing legitimately useful stuff with AI.
_keats 12/3/2025||
I think this comment (and TFA) is really just painting with overly broad strokes. Of course there are going to be people in tech hubs who are very pro-AI, either because they work with it directly and have had legitimately positive experiences, or because they work with it and begrudgingly see the writing on the wall for what it means for software professionals.

I can assure you, living in Seattle I still encounter AI boosters just as much as I encounter AI haters/skeptics.

65 12/3/2025|||
Strangely, I've found the only people who are super excited about AI are executive-level boomers. My mom loves AI and uses it to do her job, which of course has poor results. All the younger people I know hate AI. Perhaps it's also a generational difference.
ggerni 12/3/2025||
[flagged]
sirreal14 12/3/2025||
As a Seattle SWE, I'd say most of my coworkers do hate all the time-wasting AI stuff being shoved down our throats. There are a few evangelical AI boosters I do work with, but I keep catching mistakes in their code that they didn't used to make. Large suites of elegant looking unit tests, but the unit tests include large amounts of code duplicating functionality of the test framework for no reason, and I've even seen unit tests that mock the actual function under test. New features that actually already exist with more sane APIs. Code that is a tangled web of spaghetti. These people largely think AI is improving their speed but then their code isn't making it past code review. I worry about teams with less stringent code review cultures, modifying or improving these systems is going to be a major pain.
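For anyone who hasn't seen the mock-the-function-under-test pattern, it looks roughly like this. A made-up minimal sketch in Python (the function and test names are hypothetical, not from any real codebase); the test passes no matter what the real implementation does:

    # Made-up sketch of the antipattern: the test patches the very function
    # it claims to cover, so the assertion can never fail.
    import unittest
    from unittest import mock


    def compute_total(prices):
        """The real function supposedly under test."""
        return sum(prices)


    class TestComputeTotal(unittest.TestCase):
        def test_compute_total(self):
            # Patch the function under test itself...
            with mock.patch(f"{__name__}.compute_total", return_value=42):
                result = compute_total([1, 2, 3])
            # ...which means this only verifies that the mock returned 42.
            self.assertEqual(result, 42)


    if __name__ == "__main__":
        unittest.main()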
sudoshred 12/3/2025||
As someone on a team with a less stringent code review culture, AI generated code creates more work when used indiscriminately. Good enough to get approved but full of non-obvious errors that cause expensive rework which only gets prioritized once the shortcomings become painfully obvious (usually) months after the original work was “completed” and once the original author has forgotten the details, or worse, left the team entirely. Not to say AI generated code is not occasionally valuable, just not for anything that is intended to be correct and maintainable indefinitely by other developers. The real challenge is people using AI generated code as a mechanism to avoid fully understanding the problem that needs to be solved.
bn-l 12/4/2025||
Exactly it’s the non-obvious errors that are easy to miss—doubly so if you are just scanning the code. Those errors can create very hard to find bugs.

So between the debugging and the many times you need to reprompt and redo (if you bother at all, which then adds more debugging time), is any time actually saved?

I think the dust hasn’t settled yet because no one has shipped mostly AI generated code for a non-trivial application. They couldn’t have with its current state. So it’s still unknown whether building on incredibly shaky ground will actually work in real life (I personally doubt it).

psyclobe 12/3/2025|||
> and I've even seen unit tests that mock the actual function under test.

Yup. AI is so fickle it'll do anything to accomplish the task. But AI is just a tool; it's all about what you allow it to do. Can't blame AI, really.

dpark 12/3/2025|||
In fairness I’ve seen humans make that mistake. We had a complete outage in the testing of a product once and a couple of tests were still green. Turns out they tested nothing and never had.
sirreal14 12/3/2025||
> In fairness I’ve seen humans make that mistake

These were (formerly) not the kinds of humans who regularly made these kinds of mistakes.

lbrito 12/3/2025||||
Leverage.

That kind of slop already existed, but AI scales it by an order of magnitude.

I guess the same can be said of any technology, but AI is just a more powerful tool overall. Using languages as an example: let's say duck typing allowed a 10% productivity boost, but also introduced 5% more mistakes/problems. AI (claims to) allow a 10x productivity boost, but also ~10x the mistakes/problems.

insane_dreamer 12/4/2025||||
I had Claude try to pull the same trick on me just yesterday. It will also try to cheat and apply a "fix" that just masks the real problem.
mehagar 12/3/2025|||
If a tool makes it easy to shoot yourself in the foot, then it's not a good tool. See C++.
jahsome 12/3/2025|||
I'm no apologist, but this statement doesn't ring true for me. It's easy to shock yourself with electricity; is it a bad tool?
crabmusket 12/4/2025||
Electricity isn't a tool, it's nature. An unenclosed electrical plug which you had to be really careful when handling would be a bad tool, yes.

A tool is something designed by humans. We don't get to design electricity, but we do get to design the systems we put in place around it.

Gravity isn't a tool, but stairs are, and there are good and bad stairs.

jahsome 12/9/2025|||
Appreciate the perspective.

Of course it's bad. It's new. But it won't always be either of those things. I think "bad" is a relative assessment, based on a build-up of knowledge, often over decades.

Electrical plugs and stairs are "good" only because that knowledge has been discovered and has been regulated. Expecting a tool to be literally and metaphorically fool-proof immediately upon discovery strikes me as pretty disingenuous.

In the case of AI, the most anti-AI crowd are often vehement with their fingers in their ears saying "it's not good and never will be, and shouldn't exist." To be fair, the pro-AI crowd are often raving as if all the kinks had already been worked out.

Alex_L_Wood 12/4/2025|||
A knife then.
mr_toad 12/4/2025||||
Most tools are dangerous in the hands of the inept or the careless. Don’t run with scissors.
hudon 12/4/2025|||
A gun is a good tool easy to shoot yourself in the foot with
doyougnu 12/3/2025|||
I've interfaced with some AI-generated code, and after several instances of finding subtle yet very wrong bugs, I now digest code that I suspect comes from AI (or an AI-loving coworker) with much, much more scrutiny than I used to. I've frankly lost trust in any kind of care for quality or due diligence from some coworkers.

I see how the use of AI is useful, but I feel that the practitioners of AI-as-coding-agent are running away from the real work. How can you tell me about the system you say you have created if you don't have the patience to make it or think about it deeply in the first place?

teej 12/3/2025|||
Your coworkers were probably writing subtle bugs before AI too.
munificent 12/3/2025|||
Would you rather consume a bowl of soup with a fly in it, or a 50 gallon drum with 1,000 flies in it? In which scenario are you more likely to fish out all the flies before you eat one?
teej 12/3/2025||
Easier to skim 1000 flies from a single drum than 100 flies from 100 bowls of soup.
munificent 12/3/2025|||
Alas, the flies are not floating on the surface. They are deeply mixed in, almost as if the machine that generated the soup wanted desperately to appear to be doing an excellent job making fly-free soup.
andrei_says_ 12/4/2025||
… while not having a real distinction between flies and non-fly ingredients.
Vegenoid 12/3/2025||||
No, I think it would be far easier to pick 100 flies, each from a single bowl of soup, than to pick all 1,000 flies out of a 50-gallon drum.

You don’t get to fix bugs in code by simply pouring it through a filter.

solid_fuel 12/4/2025|||
I think the dynamic is different - before, they were writing and testing the functions and features as they went. Now, (some of) my coworkers just push a PR for the first or second thing copilot suggested. They generate code, test it once, it works that time, and then they ship it. So when I am looking through the PR it's effectively the _first_ time a human has actually looked over the suggested code.

Anecdote: In the 2 months after my org pushed copilot down to everyone the number of warnings in the codebase of our main project went from 2 to 65. I eventually cleaned those up and created a github action that rejects any PR if it emits new warnings, but it created a lot of pushback initially.
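The gate itself doesn't have to be fancy. Something along these lines is enough to fail a PR that grows the warning count (a rough sketch, not my actual action; the baseline file name and build command are made up):

    # Rough sketch: fail CI if the current build emits more warnings than a
    # committed baseline count.
    import json
    import re
    import subprocess
    import sys

    BASELINE_FILE = "warning_baseline.json"   # e.g. {"warnings": 2}
    WARNING_RE = re.compile(r"\bwarning\b", re.IGNORECASE)


    def count_warnings(build_cmd):
        # Run the build and count output lines that mention a warning.
        proc = subprocess.run(build_cmd, capture_output=True, text=True)
        output = proc.stdout + proc.stderr
        return sum(1 for line in output.splitlines() if WARNING_RE.search(line))


    def main():
        with open(BASELINE_FILE) as f:
            baseline = json.load(f)["warnings"]
        current = count_warnings(["make", "build"])  # whatever the project's build step is
        if current > baseline:
            print(f"New warnings introduced: {current} > baseline {baseline}")
            return 1
        print(f"Warning count OK: {current} <= baseline {baseline}")
        return 0


    if __name__ == "__main__":
        sys.exit(main())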

insin 12/4/2025||
Then, when you've taken an hour to be the first person to understand how their code works from top to bottom and to point out obvious bugs, problems, and design improvements (no, I don't think this component needs 8 useEffects added to it that deal exclusively with global state that's only relevant 2 layers down, effectively treating React components like an event-handling system for data - don't believe people who tell you LLMs are good at React; if you see a useEffect with an obvious LLM comment above it, it's likely to be buggy or unnecessary), your questions about it are answered with an immediate flurry of commits and it's back to square one.

Who are we speeding up, exactly?

solid_fuel 12/4/2025||
Yep, and if you're lucky they actually paste your comments back into the LLM. A lot of times it seems like they just prompted for some generic changes, and the next revision has tons of changes from the first draft. Your job basically becomes playing reviewer to someone else's interactions with an LLM.

It's about as productive as people who reply to questions with "ChatGPT says <...>" except they're getting paid to do it.

andrei_says_ 12/4/2025|||
I wonder if there’s a way to measure the cost of such code and associate it with the individuals incurring it. Unless this shows up on reports, managers will continue believing LLMs are magic time-saving machines writing perfect code.
nikkwong 12/4/2025|||
As another Seattle SWE, I'll go against the grain and say that I think AI is going to change the nature of the labor market for SWEs, and my guess would be for the negative. People need to remember that the ability of AI in code generation today is the worst that it ever will be, and it's only going to improve from here. If you were to judge just by the sentiment on HN, you would think no coder worth their salt was using this in the real world—but my experience on a few teams over the last two years has been exactly the opposite—people are often embarrassed to admit it, but they are using it all the time. There are many engineers at Meta who "no longer code" by hand and do literally all of their problem solving with AI.

I remember last year or even earlier this year feeling like the models had plateau'd and I was of the mindset that these tools would probably forever just augment SWEs without fully replacing them. But with Opus 4.5, gemini 3, et al., these models are incredibly powerful and more and more SWEs are leaning on them more and more—a trend that may slow down or speed up—but is never going to backslide. I think people that don't generally see this are fooling themselves.

Sure, there are problem areas—it misses stuff, there are subtle bugs, it's not good for every codebase, for every language, for every scenario. There is some sloppiness that is hard to catch. But this is true with humans too. Just remember, the ability of the models today is the worst that it will ever be—it's only going to get better. And it doesn't need to be perfect to rapidly change the job market for SWEs—it's good enough to do enough of the tasks for enough mid-level SWEs at enough companies to reshape the market.

I'm sure I'll get downvoted to hell for this comment; but I think SWEs (and everyone else for that matter) would do best to practice some fiscal austerity, because I would imagine the chance of many of us being on the losing side of this within the next decade is non-trivial. I mean, they've made all of the progress up to now in essentially the last 5 years, and the models are already incredibly capable.

kevin948 12/4/2025|||
This has been exactly my mindset as well (another Seattle SWE/DS). The baseline capability has been improving and compounding, not getting worse. It'd actually be quite convenient if AI's capabilities stayed exactly where they are now; the real problems come if AI does work.

I'm extremely skeptical of the argument that this will end up creating jobs just like other technological advances did. I'm sure that will happen around the edges, but this is the first time thinking itself is being commodified, even if it's rudimentary in its current state. It feels very different from automating physical labor: most folks don't dream of working on an assembly line. But I'm not sure what's left if white collar work and creative work are automated en masse for "efficiency's" sake. Most folks like feeling like they're contributing towards something, despite some people who would rather do nothing.

To me it is clear that this is going to have negative effects on SWE and DS labor, and I'm unsure if I'll have a career in 5 years despite being a senior with a great track record. So, agreed. Save what you can.

hodgesrm 12/4/2025|||
> the real problems come if AI does work.

Exactly. For example, what happens to open source projects where developers don't have access to the latest proprietary dev tools? Or, what happens to projects like Spring if AI tools can generate framework code from scratch? I've seen maven builds on Java projects that pull in hundreds or even thousands of libraries. 99% of that code is never even used.

The real changes to jobs will be driven by considerations like these. Not saying this will happen but you can't rule it out either.

edit: Added last sentence.

didibus 12/4/2025||||
> It'd actually be quite convenient if AI's capabilities stayed exactly where they are now

That's what I'm crossing my fingers for: it makes our job easier but doesn't degrade our worth. It's the best possible outcome for devs.

pseudalopex 12/4/2025|||
> I'm extremely skeptical of the argument that this will end up creating jobs just like other technological advances did. I'm sure that will happen around the edges, but this is the first time thinking itself is being commodified, even if it's rudimentary in its current state. It feels very different from automating physical labor: most folks don't dream of working on an assembly line.

Most people do not dream of working most white collar jobs. Many people dream of meaningful physical labor. And many people who worked in mines did not dream of being told to learn to code.

kevin948 12/4/2025||
The important piece here is that many people want to contribute to something intellectually, and a huge pathway for that is at risk of being significantly eroded. Permanently.

Your point stands that many people like physical labor. Whether they want to artisanally craft something, or desire being outside/doing physical or even menial labor more than sitting in an office. True, but that doesn't solve the above issue, just like it didn't in reverse. Telling miners to learn to code was... not great. And from my perspective neither is outsourcing our thinking en masse to AI.

YZF 12/4/2025||||
I keep getting blown away by AI (specifically Claude Code with the latest models). What it does is literally science fiction. If you told someone 5 years ago that AI can find and fix a bug in some complex code with almost zero human intervention nobody would believe you, but this is the reality today. It can find bugs, it can fix bugs, it can refactor code, it can write code. Yes, not perfect, but with a well organized code base, and with careful prompting, it rivals humans in many tasks (certainly outperforms them in some aspects).

As you're also saying this is the worst it will ever be. There is only one direction, the question is the acceleration/velocity.

Where I'm not sure I agree is with the perception this automatically means we're all going to be out of a job. It's possible there would be more software engineering jobs. It's not clear. Someone still has to catch the bad approaches, the big mistakes, etc. There is going to be a lot more software produced with these tools than ever.

dragonwriter 12/4/2025||||
> Just remember, the ability of the models today is the worst that it will ever be—it's only going to get better.

This is the ultimate hypester’s motte to retreat to whenever the bailey of a technology’s claimed utility falls. It’s trivially true of literally any technology, but also completely meaningless on its own.

throw234234234 12/4/2025||||
I think whether you are right or wrong it makes sense to hedge your bets. I suspect many people here are feeling some sense of fear (career, future implications, etc); I certainly do on some of these points and I think that's a rational response to be aware of the risk of the future unknown.

In general I think: if I were not personally invested in this situation (i.e., as just another man on the street), what would my immediate reaction be? Would I still become a software engineer, for example? Even if it doesn't come to pass, given what I know now, would I take that bet with my life/career?

I think if people were honest with themselves sadly the answer for many would probably be "no". Most other professions wouldn't do this to themselves either; SWE is quite unique in this regard.

LaFolle 12/4/2025||||
> code generation today is the worst that it ever will be, and it's only going to improve from here.

I'm also of the mindset that even if this is not true, that is, even if the current state of LLMs is the best it will ever be, AI would still be helpful. It is already great at writing self-contained scripts, and efficiency with large codebases has already improved.

> I would imagine the chance of many of us being on the losing side of this within the next decade is non-trivial.

Yes, this is worrisome. Though it's ironic that almost every serious software engineer, at some point in their early childhood or career when programming was more for fun than work, thought of how cool it would be for a computer program to write a computer program. And now that we have the capability in front of our eyes, we're afraid of it.

But one thing humans are really good at is adaptability. We adapt to circumstances and situations, good or bad. Even if the worst happens and people lose jobs, in the short term it will be hard on their families; over time, however, humans will adapt, learn to coexist with AI, and find the next endeavour to conquer.

Rejecting AI is not the solution. Using it like any other tool is. A tool that, used correctly by the right person, can indeed produce faster results.

nikkwong 12/4/2025||
I mean, some are good at adaptability, while others get completely left in the dust. Look at the rust belt: jobs have left, and everyone there is desperate for a handout. Trump is busy trying to engineer a recession in the US—when recessions happen, companies at the margin go belly-up and the fat is trimmed from the workforce. With the inroads that AI is making into the workforce, it could be the first restructuring where we see massive losses in jobs.
didibus 12/4/2025||||
> I mean, they've made all of the progress up to now in essentially the last 5 years

I have to challenge this one. The research on natural language generation and machine learning dates back to the 50s; it just only recently came together at scale in a way that became useful. Tons of the hardest progress was made over many decades, and very little innovation happened in the last 5 years. The innovation has mostly been bigger scale, better data, minor architectural tweaks, and reinforcement learning with human feedback and other such fine-tuning.

nikkwong 12/4/2025||
We're definitely in the territory of splitting hairs; but I think most of what people call modern AI is the result of the transformer paper. Of course this was built off the back of decades of research.
Animats 12/4/2025||||
> People need to remember that the ability of AI in code generation today is the worst that it ever will be, and it's only going to improve from here.

I sure hope so. But until the hallucination problem is solved, there's still going to be a lot of toxic waste generated. We have got to get AI systems which know when they don't know something and don't try to fake it.

SideburnsOfDoom 12/4/2025||
The "hallucination problem" can't be solved, it's intrinsic to how stochastic text and image generators work. It's not a bug to be fixed, it's not some leak in the pipe somewhere, it is the whole pipe.

> there's still going to be a lot of toxic waste generated.

And how are LLMs going to get better as the quality of the training data nosedives because of this? Model collapse is a thing. You can easily see a scenario where they'll never be better than they are now.

zeroonetwothree 12/4/2025|||
> People need to remember that the ability of AI in code generation today is the worst that it ever will be

I've been reading this since 2023 and yet it hasn't really improved all that much. The same things are still problems that were problems back then. And if anything the improvement is slowing down, not speeding up.

I suspect unless we have real AGI we won't have human-level coding from AIs.

int_19h 12/4/2025||
It has improved drastically, as evidenced by the kinds of issues these things can handle with minimal supervision now.
openasocket 12/4/2025|||
Pretty much. Someone on our team put out a code review for some new feature and then bounced for a 2-week vacation. One of our junior engineers approved it. Despite the fact that it was in a section of dead code that wasn’t supposed to even be enabled yet, it managed to break our test environment. It took senior engineers a day to figure out how that was even possible before reverting. We had another couple of engineers take a look to see what needed to be done to fix the bug. All of them came away with the conclusion that it was 1,000 lines of pure AI-generated slop with no redeemable value. Trying to fix it would take more work than just re-implementing it from scratch.
justatdotin 12/4/2025||
> One of our junior engineers approved it.

pretty sure the process I've seen most places is more like: one junior approves, one senior approves, then the owner manually merges.

so your process seems inadequate to me, agents or not.

also, was it tagged as generated? that seems like an obvious safety feature. As a junior, I might be thinking: 'my senior colleague sure knows lots of this stuff', but all it would take to dispel my illusion is an agent tag on the PR.

openasocket 12/4/2025||
> pretty sure the process I've seen most places is more like: one junior approves, one senior approves, then the owner manually merges.

Yeah that’s what I think we need to enforce. To answer your question, it was not tagged as AI generated. Frankly, I think we should ban AI-generated code outright, though labeling it as such would be a good compromise.

mips_avatar 12/3/2025|||
My hot take is that the evangelical people don't really like AI either; they're just scared. I think you have to be outside of big tech to appreciate AI.
elzbardico 12/3/2025||
If AI replaces software engineers, people outside tech don't have much chance of surviving it either.
LPisGood 12/4/2025||
Exactly. I think it’s pretty clear that software engineering is an “intelligence complete” problem. If you can automatically solve SWE, then you can automatically solve pretty much all knowledge work.
elzbardico 12/4/2025||
A lot of modern corporate work is bullshit work.

I don't think it is too outrageous to believe that LLMs can do a lot of what all those armies of corporate bureaucrats do.

int_19h 12/4/2025||
The difference is that unlike SWEs, the people doing all that bullshit work are much better at networking, so they will (collectively) find a reason why they shouldn't be replaced with AI and push it through.

SWEs could do so as well, if only we were unionized.

elzbardico 12/5/2025||
Gotta agree that you have a point here, a pretty strong one indeed.
jfalcon 12/3/2025||
I see it like the hype around js/node and whatever module tech was glued to it when it was new, from the perspective of someone who didn't code js. Sum of F's given is still zero.

-206dev

thorum 12/3/2025||
People hate what the corporations want AI to be and people hate when AI is used the way corporations seem to think it should be used, because the executives at these companies have no taste and no vision for the future of being human. And that is what people think of when they hear “AI”.

I still think there’s a third path, one that makes people’s lives better with thoughtful, respectful, and human-first use of AI. But for some reason there aren’t many people working on that.

sph 12/4/2025|
> I still think there’s a third path, one that makes people’s lives better with thoughtful, respectful, and human-first use of AI. But for some reason there aren’t many people working on that.

I am thinking about this third path a lot, but the reality is that it wouldn't make AI more interesting than any other tool humans use to go about their daily lives. Does one get this obsessed about screwdrivers?

The issue is that a human-first world where technology is subservient to our needs is very incompatible with our current society. The AI hype is part and parcel of the capitalistic mode of production where humans are ultimately judged for their ability to produce more commodities; the goal has always been to improve productivity to make more for cheaper — this time, the quest for efficiency has found a viable replacement for many humans activities.

pkasting 12/3/2025||
Ex-Google here; there are many people, both current and former Googlers, who feel the same way as the composite coworker in the linked post.

I haven't escaped this mindset myself. I'm convinced there are a small number of places where LLMs make truly effective tools (see: generation of "must be plausible, need not be accurate" data, e.g. concept art or crowd animations in movies), a large number of places where LLMs make apparently-effective tools that have negative long-term consequences (see: anything involving learning a new skill, anything where correctness is critical), and a large number of places where LLMs are simply ineffective from the get-go but will increasingly be rammed down consumers' throats.

Accordingly I tend to be overly skeptical of AI proponents and anything touching AI. It would be nice if I was more rational, but I'm not; I want everyone working on AI and making money from AI to crash and burn hard. (See also: cryptocurrency)

CSMastermind 12/3/2025||
My friends at Google are some of the most negative about the potential of AI to improve software development. I was always surprised by this and assumed that Google internally would be one of the first places to adopt these tools.
crystal_revenge 12/4/2025|||
I've generally found an inverse correlation between "understands AI" and "exuberance for AI".

I'm the only person at my current company who has had experience at multiple AI companies (the rest have never worked on it in a production environment; one of our projects is literally something I got paid to deliver to customers at another startup), has written professionally about the topic, and has worked directly with some big names in the space. Unsurprisingly, I have nothing to do with any of our AI efforts.

One of the members of our leadership team, who I don't believe understands matrix multiplication, genuinely believes he's about to transcend human identity by merging with AI. He's publicly discussed how hard it is to maintain friendship with normal humans who can't keep up.

Now I absolutely think AI is useful, but these people don't want AI to be useful they want it to be something that anyone who understands it knows it can't be.

It's getting to the point where I genuinely feel I'm witnessing some sort of mass hysteria event. I keep getting introduced to people who have almost no understanding of the fundamentals of how LLMs work and who hold the most radically fantastic ideas about what they are capable of, on a level beyond anything I have experienced in my fairly long technical career.

adamisom 12/4/2025|||
Personally, I don't understand how LLMs work. I know some ML math and certainly could learn, and probably will, soon.

But my opinions about what LLMs can do are based on... what LLMs can do. What I can see them doing. With my eyes.

The right answer to the question "What can LLMs do?" is... looking... at what LLMs can do.

crystal_revenge 12/4/2025|||
I'm sure you're already familiar with the ELIZA effect [0], but you should be a bit skeptical of what you are seeing with your eyes, especially when it comes to language. Humans have an incredible weakness to be tricked by language.

You should be doubly skeptical ever since RLHF became standard, as the model has literally been optimized to give you the answers you find most pleasing.

The best way to measure of course is with evaluations, and I have done professional LLM model evaluation work for about 2 years. I've seen (and written) tons of evals and they both impress me and inform my skepticism about the limitations of LLMs. I've also seen countless times where people are convinced "with their eyes" they've found a prompt trick that improves the results, only to be shown that this doesn't pan out when run on a full eval suite.

As an aside: what's fascinating is that our visual system seems to be much more skeptical. An eyeball rendered slightly off by a diffusion model will immediately set off alarms, whereas enough clever wordplay from an LLM will make us drop our guard.

0. https://en.wikipedia.org/wiki/ELIZA_effect

heurist 12/4/2025|||
We get around this a bit when using it to write code since we have unit tests and can verify that it's making correct changes and adhering to an architecture. It has truly become much more capable in the last year. This technology is so flexible that it can be used in ways no eval will ever touch and still perform well. You can't just rely on what the labs say about it, you have to USE it.
RealityVoid 12/4/2025|||
Interesting observation about the visual system. Truth be told, we get visual feedback about the world at a much higher data rate AND the visual signal is usually much more highly correlated with reality, whereas language is a virtual byproduct of cognition and communication.
ACCount37 12/4/2025||||
No one understands how LLMs work. But some people manage to delude themselves into thinking that they do.

One key thing that people prefer not to think about is that LLMs aren't created by humans. They are created by an inhuman optimization algorithm that humans have learned to invoke and feed with data and computation.

Humans have a say in what it does and how, but "a say" is about the extent of it. The rest is a black box - incomprehensible products of a poorly understood mathematical process. The kind of thing you have to research just to get some small glimpses of how it does what it does.

Expecting those humans to understand how LLMs work is a bit like expecting a woman to know how humans work because she made a human once.

woah 12/4/2025|||
Bro- do you even matrix multiply?
monch1962 12/4/2025||||
Spot on in my experience.

I work in a space where I get to build and optimise AI tools for my own and my team's use pretty much daily. As such I focus mainly on AI'ing the crap out of boring & time-consuming stuff that doesn't interest any of us any more, and luckily enough there's a whole lot of low hanging fruit in that space where AI is a genuine time, cost and sanity saver.

However any activity that requires directed conscious thought and decision making where the end state isn't clearly definable up front tends to be really difficult for AI. So much of that work relies on a level of intuition and knowledge that is very hard to explain to a layman - let alone eidetic idiots like most AIs.

One example is trying to get AI to identify security IT incidents in real time and take proactive action. Skilled practitioners can fairly easily use AI to detect anomalous events in near real time, but getting AI to take the next step to work out which combinations of "anomalous" activities equate to "likely security incident" is much harder. A reasonably competent human can usually do that relatively quickly, but often can't explain how they do it.

Working out what action is appropriate once the "likely security incident" has been identified is another task that a reasonably competent human can do, but where AIs are hopeless. In most cases, a competent human is WAAAY better at identifying a reasonable way forward based on insufficient knowledge. In those cases, a good decision made quickly is preferable to a perfect decision made slowly, and humans understand this fairly intuitively.

throwaway-0001 12/4/2025||||
I think there is a correlation between knowing something's internals and what you can expect from it, but it's not like the person who knows the internals is always much, much better.

Example: many people created websites without a clue of how they really work. And got millions of people on it. Or had crazy ideas to do things with them.

At the same time there are devs that know how internals work but can’t get 1 user.

pc manufacturers never were able to even imagine what random people were able to do with their pc.

This is to say that even if you know the internals you can claim you know better, but that doesn't mean it's absolute.

Sometimes knowing the fundamentals is a limitation. It will limit your imagination.

crystal_revenge 12/4/2025|||
I'm a big fan of the concept of 初心 (Japanese: Shoshin, aka "beginner's mind" [0]) and largely agree with Suzuki's famous quote:

> “In the beginner’s mind there are many possibilities, but in the expert’s there are few”

Experts do tend to be limited in what they see as possible. But I don't think that allows carte blanche belief that a fancy Markov Chain will let you transcend humanity. I would argue one of the key concepts of "beginners mind" is not radical assurance in what's possible but unbounded curiosity and willingness to explore with an open mind. Right now we see this in the Stable Diffusion community: there are tons of people who also don't understand matrix multiplication that are doing incredible work through pure experimentation. There's a huge gap between "I wonder what will happen if I just mix these models together" and "we're just a few years from surrendering our will to AI". None of the people I'm concerned about have what I would consider an "open mind" about the topic of AI. They are sure of what they know and to disagree is to invite complete rejection. Hardly a principle of beginners mind.

Additionally:

> pc manufacturers never were able to even imagine what random people were able to do with their pc.

Belies a deep ignorance of the history of personal computing. Honestly, I don't think modern computing has ever returned to the ambition of what was being dreamt up, by experts, at Xerox PARC. The demos on the Xerox Alto in the early 1970s are still ambitious in some senses. And, as much as I'm not a huge fan, Gates and Jobs absolutely had grand visions for what the PC would be.

0. https://en.wikipedia.org/wiki/Shoshin

musebox35 12/4/2025|||
I think this is what is blunted by mass education and most textbooks. We need to discover it again if we want to enjoy our profession with all the signals flowing from social media about all the great things other people are achieving. Staying stupid and hungry really helps.
cindyllm 12/4/2025|||
[dead]
musebox35 12/4/2025|||
I think this is more of a mechanistic-understanding-vs-fundamental-insight kind of situation. The linear algebra picture is currently very mechanistic, since it only tells us what the computations are. There are research groups trying to go beyond that, but the insight from these efforts is currently very limited. However, the probabilistic view is much clearer. You can have many explorable insights, both potentially true and false, just by understanding the loss functions, what the model is sampling from, what the marginal or conditional distributions are, and so on. Generative AI models are beautiful at that level. It is truly mind-blowing that in 2025 we are able to sample from megapixel image distributions conditioned on NLP text prompts.
throwaway-0001 12/4/2025||
If that were true, then people could have predicted this AI many years ago.
musebox35 12/4/2025||
If you dig up old ML/vision papers, you will see that, formulation-wise, they actually did, but they lacked the data, the compute, and the mechanistic machinery provided by the transformer architecture. The wheels of progress are slow and require many rotations to finally reach somewhere.
yumraj 12/4/2025||||
> I've generally found an inverse correlation between "understands AI" and "exuberance for AI".

A few years ago I had this exact observation regarding self-driving cars. Non- or semi-engineers who worked in the tech industry were very bullish about self-driving cars, believing every ETA spewed by Musk; engineers were cautiously optimistic or pessimistic depending on their understanding of AI, LiDAR, etc.

dreamcompiler 12/4/2025||||
This completely explains why so many engineers are skeptical of AI while so many managers embrace it: The engineers are the ones who understand it.

(BTW, if you're an engineer who thinks you don't understand AI or are not qualified to work on it, think again. It's just linear algebra, and linear algebra is not that hard. Once you spend a day studying it, you'll think "Is that all there is to it?" The only difficult part of AI is learning PyTorch, since all the AI papers are written in terms of Python nowadays instead of -- you know -- math.)
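To put a concrete shape on "it's just linear algebra", here is a minimal sketch (assuming PyTorch is installed; the sizes are arbitrary): one layer of a neural net is a matrix multiply, a bias add, and a pointwise nonlinearity.

    # Minimal sketch: a neural-net layer is a matrix multiply, a bias add,
    # and a pointwise nonlinearity. Most of the rest is scale and bookkeeping.
    import torch

    x = torch.randn(4, 8)        # batch of 4 inputs, 8 features each
    W = torch.randn(16, 8)       # weight matrix for a 16-unit layer
    b = torch.randn(16)          # bias vector

    h = torch.relu(x @ W.T + b)  # the layer: relu(x W^T + b)

    # The same thing via the library's built-in module, for comparison.
    layer = torch.nn.Linear(8, 16)
    h2 = torch.relu(layer(x))
    print(h.shape, h2.shape)     # both torch.Size([4, 16])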

I've been building neural net systems since the late 1980s. And yes they work and they do useful things when you have modern amounts of compute available, but they are not the second coming of $DEITY.

LTL_FTC 12/4/2025|||
Linear algebra cannot be learned in a day. Maybe multiplying matrices when the dimensions allow, but there is far more to linear algebra than knowing how to multiply matrices. Knowing when and why is far more interesting. Knowing how to decompose them. Knowing what a non-singular matrix is and why it's special, and so on. Once you know what's covered in a basic lower-division linear algebra class, you can move on to linear programming and learn about cost functions and optimization, or numerical analysis. PyTorch is just a calculator. If I handed someone a TI-84 they wouldn't magically know how to bust out statistics on it…
tolciho 12/4/2025|||
> This completely explains why so many engineers are skeptical of AI while so many managers embrace it: The engineers are the ones who understand it.

Curiously some Feynman chap reported that several NASA engineers put the chance of the Challenger going kablooie—an untechnical term for rapid unscheduled deconstruction, which the Challenger had then just recently exhibited—at 1 in 200, or so, while the manager said, after some prevarications—"weaseled" is Feynman's term—that the chance was 1 in 100,000 with 100% confidence.

nilkn 12/4/2025||||
I mostly disagree with this. Lots of things correlate weakly with other things, often in confusing and overlapping ways. For instance, expertise can also correlate with resistance to change. Ego can correlate with protection of the status quo and dismissal of people who don't have the "right" credentials. Love of craft can correlate with distaste for automation of said craft (regardless of the effectiveness of the automation). Threat to personal financial stability can correlate with resistance (regardless of technical merit). Potential for personal profit can correlate with support (regardless of technical merit). Understanding neural nets can correlate both with exuberance and skepticism in slightly different populations.

Correlations are interesting but when examined only individually they are not nearly as meaningful as they might seem. Which one you latch onto as "the truth" probably says more about what tribe you value or want to be part of than anything fundamental about technology or society or people in general.

safety1st 12/4/2025|||
It's definitely interesting to look at people's mental models around AI.

I don't know shit about the math that makes it work, but my mental model is basically: "An LLM is an additional tool in my toolbox which performs summarization, classification and text transformation tasks for me imperfectly, but overall pretty well."

Probably lots of flaws in that model but I just try to think like an engineer who's attempting to get a job done and staying up to date on his tooling.

But as you say there are people who have been fooled by the "AI" angle of all this, and they think they're witnessing the birth of a machine god or something. The example that really makes me throw up my hands is r/MyBoyfriendIsAI where you have women agreeing to marry the LLM and other nonsense that is unfathomable to the mentally well.

There's always been a subset of humans who believe unimaginably stupid things, like that there's a guy in the sky who throws lightning bolts when he's angry, or whatever. The interesting (as in frightening) trend in modernity is that instead of these moron cults forming around natural phenomena we're increasingly forming them around things that are human made. Sometimes we form them around the state and human leaders, increasingly we're forming them around technologies, in line with Arthur C. Clarke's third law - that "Any sufficiently advanced technology is indistinguishable from magic."

If I sound harsh it's because I am, we don't want these moron cults to win, the outcome would be terrible, some high tech version of the Dark Ages. Yet at this moment we have business and political leaders and countless run-of-the-mill tech world grifters who are leaning into the moron cult version of AI rather than encouraging people to just see it as another tool in the box.

yoyohello13 12/3/2025||||
Google has good engineers. Generally I've noticed the better someone is at coding, the more critical they are of AI-generated code. Which makes sense, honestly: it's easier to spot flaws the more expert you are. This doesn't mean they don't use AI-generated code, just that they are more careful about when and where.
venturecruelty 12/3/2025|||
Yes, because they're more likely to understand that the computer isn't this magical black box, and that just because we've made ELIZA marginally better, doesn't mean it's actually good. Anecdata, but the people I've seen be dazzled by AI the most are people with little to no programming experience. They're also the ones most likely to look on computer experts with disdain.
josephg 12/3/2025|||
Well yeah. And because when an expert looks at the code chatgpt produces, the flaws are more obvious. It programs with the skill of the median programmer on GitHub. For beginners and people who do cookie cutter work, this can be incredible because it writes the same or better code they could write, fast and for free. But for experts, the code it produces is consistently worse than what we can do. At best my pride demands I fix all its flaws before shipping. More commonly, it’s a waste of time to ask it to help, and I need to code the solution from scratch myself anyway.

I use it for throwaway prototypes and demos. And whenever I’m thrust into a language I don’t know that well, or to help me debug weird issues outside my area of expertise. But when I go deep on a problem, it’s often worse than useless.

ethbr1 12/3/2025|||
This is why AI is the perfect management Rorschach test.

To management (out of IC roles for long enough to lose their technical expertise), it looks perfect!

To ICs, the flaws are apparent!

So inevitably management greenlights new AI projects* and behaviors, and then everyone is in the 'This was my idea, so it can't fail' CYA scenario.

* Add in a dash of management consulting advice here, and note that management consultants' core product was already literally 'something that looks plausible enough to make execs spend money on it'

torginus 12/4/2025||||
In my experience (with ChatGPT 5.1 as of late), the AI follows a problem->solution internal logic and doesn't stop to think about how to structure its code.

If you ask for an endpoint to a CRUD API, it'll make one. If you ask for 5, it'll repeat the same code 5 times and modify it for the use case.

A dev wouldn't do this, they would try to figure out the common parts of code, pull them out into helpers, and try to make as little duplicated code as possible.
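For contrast, the kind of factoring I'd expect from a dev is something like this (a rough sketch, assuming an Express-style app with express.json() and an in-memory Map as the store; the names are just illustrative):

    // Register list/get/create/delete once for any resource, instead of
    // pasting near-identical endpoint blocks per resource.
    function registerCrudRoutes(app, name, store) {
      const base = `/${name}`;
      app.get(base, (req, res) => res.json([...store.values()]));
      app.get(`${base}/:id`, (req, res) => {
        const item = store.get(req.params.id);
        item ? res.json(item) : res.status(404).end();
      });
      app.post(base, (req, res) => {
        store.set(req.body.id, req.body); // assumes the client supplies an id
        res.status(201).json(req.body);
      });
      app.delete(`${base}/:id`, (req, res) => {
        store.delete(req.params.id);
        res.status(204).end();
      });
    }

    registerCrudRoutes(app, 'users', new Map());
    registerCrudRoutes(app, 'orders', new Map());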

I feel like the AI has a strong bias towards adding things, and not removing them. The most obviously wrong thing is with CSS - when I try to do some styling, it gets 90% of the way there, but there's almost always something that's not quite right.

Then I tell the AI to fix a style, since that div is getting clipped or not correctly centered etc.

It almost always keeps adding properties, and after 2-3 tries and an incredibly bloated style, I delete the thing and take a step back and think logically about how to properly lay this out with flexbox.

bigblind 12/4/2025|||
> If you ask for an endpoint to a CRUD API, it'll make one. If you ask for 5, it'll repeat the same code 5 times and modify it for the use case.

> A dev wouldn't do this, they would try to figure out the common parts of code, pull them out into helpers, and try to make as little duplicated code as possible.

> I feel like the AI has a strong bias towards adding things, and not removing them.

I suspect this is because an LLM doesn't build a mental model of the code base like a dev does. It can decide to look at certain files, and maybe you can improve this by putting a broad architecture overview of a system in an agents.md file, I don't have much experience with that.

But for now, I'm finding it most useful to still think in terms of code architecture myself, give it small steps that are part of that architecture, and then iterate based on my own review of the AI-generated code. I don't have the confidence in it to just let some agent plan, and then run for tens of minutes or even hours building out a feature. I want to be in the loop earlier to set the direction.

BenkaiDebussy 12/11/2025||||
I've noticed this as well, though I've also noticed that you can sometimes avoid it if you're more explicit and actually say things like "can you write these endpoints in a way that ___, ___, and ____." Or I'll mention some context that I'm worried the LLM will miss (for example pointing out when there are already existing functions for doing certain things).

The broader a request is, the more likely I am to get a bunch of bloat. I think this is partly because the LLM will always try to fully solve the problem entirely from your initial prompt. Instead of stopping to clarify something, it'll just move forward with doing something that technically works. I find it's better to break things into smaller steps so that you can "intervene" if it starts to do things wrong.

babyshake 12/4/2025||||
A good system prompt goes a long way with the latest models. Even just something as simple as "use DRY principles whenever possible." or prompting a plan-implement-evaluate cycle gets pretty good results, at least for tasks that are doing things that AI is well trained on like CRUD APIs.
skissane 12/4/2025||||
> If you ask for an endpoint to a CRUD API, it'll make one. If you ask for 5, it'll repeat the same code 5 times and modify it for the use case.

I don't think this is an issue inherent to the technology. Duplicate code detectors have been around for ages. Give an AI agent a tool which calls one, ask it to reduce duplication, and it will start refactoring.

Of course, there is a risk of going too far in the other direction: refactorings which technically reduce duplication but which have unacceptable costs (you can be too DRY). But some possible solutions: (a) ask it to judge if the refactoring is worth it or not - if it judges no, just ignore the duplication and move on; (b) get a human to review the decision in (a); (c) if the AI repeatedly makes the wrong decision (according to the human), adjust the prompt, or maybe even just add some hardcoded heuristics.
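As a sketch of the shape of such a tool (a deliberately naive detector, just to show the idea; a real setup would wrap an off-the-shelf duplicate-code checker and expose it to the agent as a tool call):

    // Flag any normalized 5-line window that appears more than once.
    // An agent could run this after each edit and judge whether the
    // reported duplication is worth refactoring away.
    function findDuplicateBlocks(source, windowSize = 5) {
      const lines = source.split('\n').map(l => l.trim()).filter(Boolean);
      const seen = new Map();
      const dupes = [];
      for (let i = 0; i + windowSize <= lines.length; i++) {
        const key = lines.slice(i, i + windowSize).join('\n');
        if (seen.has(key)) dupes.push({ firstAt: seen.get(key), repeatAt: i });
        else seen.set(key, i);
      }
      return dupes;
    }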

Turskarama 12/4/2025|||
It actually is somewhat a limit of the technology. LLMs can't go back and modify their own output, later tokens are always dependent on earlier tokens and they can't do anything out of order. "Thinking" helps somewhat by allowing some iteration before they give the user actual output, but that requires them to write it the long way and THEN refactor it without being asked, which is both very expensive and something they have to recognize the user wants.
skissane 12/4/2025||
Coding agents can edit their own output - because their output is tool calls to read and write files, so they can write a file, run some check on it, modify the file to try to make it pass, run the check again, etc.
dasil003 12/4/2025|||
Sorry, but from where I sit, this only marginally closes the gap between AI and truly senior engineers.

Basically, human junior engineers start by writing code in a very procedural and literal style with duplicate logic all over the place, because that's the first step in adapting human intelligence to learning how to program. Then the programmer realizes this leads to things becoming unmaintainable, and so they start to learn the abstraction techniques of functions, etc. An LLM doesn't have to learn any of that, because it already knows all the languages and mechanical techniques in its corpus, so this beginning journey never applies.

But what the junior programmer has that the LLM doesn't is an innate, common-sense understanding of the human goals that are driving the creation of the code to begin with, and that serves them through their entire progression from junior to senior. As you point out, code can be "too DRY", but why? Senior engineers understand that DRYing up code is not a style issue; it's about maintainability, about understanding what is likely to change, and about the apparent effects on the human stakeholders who depend on the software. Basically: do these things map to concepts that are the same for human users and unlikely to diverge in the future? This is a surprisingly deep question, because perhaps every human stakeholder will swear up and down they are the same, yet 6 months from now a problem arises that requires them to diverge. At that point there is a cognitive overhead and dissonance in explaining that divergence to the users who were heretofore perfectly satisfied with one domain concept.

Ultimately the value function for success of a specific code factoring style depends on a lot of implicit context and assumptions that are baked into the heads of various stakeholders for the specific use case and can change based on myriad outside factors that are not visible to an LLM. Senior engineers understand the map is not the territory, for LLMs there is no territory.

skissane 12/4/2025||
I’m not suggesting AIs can replace senior engineers (I don’t want to be replaced!)

But, senior engineers can supervise the AI, notice when it makes suboptimal decisions, intervene to address that somehow (by editing prompts or providing new tools)… and the idea is gradually the AI will do better.

Rather than replacing engineers with AIs, engineers can use AIs to deliver more in the same amount of time

torginus 12/4/2025|||
Which I think points out the biggest issue with current AI: knowledge workers in any profession, at any skill level, tend to get the impression that AI is very impressive but prone to failing at real-world tasks unpredictably. Thus the mental model of a 'junior engineer', or of any human who does simple tasks by themselves reliably, is wrong.

AI operating at all levels needs to be constantly supervised.

Which would still make AI a worthwhile technology, as a tool, as many have remarked before me.

The problem is, companies are pushing for agentic AI instead of AI that can do repetitive, short-horizon tasks in a fast and reliable manner.

dasil003 12/4/2025|||
Sure. My point was that AI is already 25% of the way there even with its verbose, messy style. I think with your suggestions (style guidance, human in the loop, etc.) we get at most 30% of the way there.
BurningFrog 12/4/2025|||
Bad code is only really bad if it needs to be maintained.

If your AI reliably generates working code from a detailed prompt, the prompt is now the source that needs to be maintained. There is no important reason to even look at the generated code.

anon7725 12/4/2025|||
> the prompt is now the source that needs to be maintained

The inference response to the prompt is not deterministic. In fact, it’s probably chaotic since small changes to the prompt can produce large changes to the inference.

CamperBob2 12/4/2025||
> The inference response to the prompt is not deterministic.

So? Nobody cares.

Is the output of your C compiler the same every time you run it? How about your FPGA synthesis tool? Is that deterministic? Are you sure?

What difference does it make, as long as the code works?

antiloper 12/4/2025|||
> Is the output of your C compiler the same every time you run it?

Yes? Because of actual engineering mind you and not rolling the dice until the lucky number comes up.

https://reproducibility.nixos.social/evaluations/2/2d293cbfa...

CamperBob2 12/4/2025||
It's not true for a place-and-route engine, so why does it have to be true for a C compiler?

Nobody else cares. If you do, that's great, I guess... but you'll be outcompeted by people who don't.

antiloper 12/4/2025||
I'm glad you asked! https://reproducible-builds.org/
CamperBob2 12/4/2025||
That's an advertisement, not an answer.
antiloper 12/4/2025||
Did you really read and understand this page in the 1 minute between my post and your reply or did you write a dismissive answer immediately?
CamperBob2 12/4/2025||
Eh, I'll get an LLM to give me a summary later.

In the meantime: no, deterministic code generation isn't necessary, and anyone who says it is is wrong.

josephg 12/4/2025||||
The C compiler will still make working programs every time, so long as your code isn’t broken. But sometimes the code chatgpt produces won’t work. Or it'll kinda work but you’ll get weird, different bugs each time you generate it. No thanks.
CamperBob2 12/4/2025||
Nothing matters but d/dt. It's so much better than it was a year ago, it's not even funny.

How weird would it be if something like this worked perfectly out of the box, with no need for further improvement and refinement?

Capricorn2481 12/4/2025|||
> So? Nobody cares

Yeah the business surely won't care when we rerun the prompt and the server works completely differently.

> Is the output of your C compiler the same every time you run it

I've never, in my life, had a compiler generate instructions that do something completely different from what my code specifies.

That you would suggest we will reach a level where an English language prompt will give us deterministic output is just evidence you've drank the kool-aid. It's just not possible. We have code because we need to be that specific, so the business can actually be reliable. If we could be less specific, we would have done that before AI. We have tried this with no-code tools. Adding randomness is not going to help.

CamperBob2 12/4/2025||
> I've never, in my life, had a compiler generate instructions that do something completely different from what my code specifies.

Nobody is saying it should. Determinism is not a requirement for this. There are an infinite number of ways to write a program that behaves according to a given spec. This is equally true whether you are writing the source code, an LLM is writing the source code, or a compiler is generating the object code.

All that matters is that the program's requirements are met without undesired side effects. Again, this condition does not require deterministic behavior on the author's part or the compiler's.

To the extent it does require determinism, the program was poorly- or incompletely-specified.

> That you would suggest we will reach a level where an English language prompt will give us deterministic output is just evidence you've drank the kool-aid.

No, it's evidence that you're arguing with a point that wasn't made at all, or that was made by somebody else.

Capricorn2481 12/5/2025||
You're on the wrong axis. You have to be deterministic about following the spec, or it's a BUG in the compiler. Whether or not you actually have the exact same instructions, a compiler will always do what the code says or it's bugged.

LLMs do not and cannot follow the spec of English reliably, because English is open to interpretation, and that's a feature. It makes LLMs good at some tasks, but terrible for what you're suggesting. And it's weird because you have to ignore the good things about LLMs to believe what you wrote.

> There are an infinite number of ways to write a program that behaves according to a given spec

You're arguing for more abstractions on top of an already leaky abstraction. English is not an appropriate spec. You can write 50 pages of what an app should do and somebody will get it wrong. It's good for ballparking what an app should do, and LLMs can make that part faster, but it's not good for reliably plugging into your business. We don't write vars, loops, and ifs for no reason. We do it because, at the end of the day, an English spec is meaningless until someone actually encodes it into rules.

The idea that this will be AI, and we will enjoy the same reliability we get with compilers, is absurd. It's also not even a conversation worth having when LLMs hallucinate basic linux commands.

CamperBob2 12/5/2025|||
People are betting trillions that you're the one who's "on the wrong axis." Seems that if you're that confident, there's money to be made on the other side of the market, right? Got any tips?

Essentially all of the drawbacks to LLMs you're mentioning are either already obsolete or almost so, or are solvable by the usual philosopher's stone in engineering: negative feedback. In this case, feedback from carefully-structured tests. Safe to say that we'll spend more time writing tests and less time writing original code going forward.

josephg 12/6/2025||
> People are betting trillions that you're the one who's "on the wrong axis."

People are betting trillions of dollars that AI agents will do a lot of useful economic work in 10 years. But if you take the best LLMs in the world, and ask them to make a working operating system, C compiler or web browser, they fail spectacularly.

The insane investment in AI isn't because today's agents can reliably write software better than senior developers. The investment is a bet that they'll be able to reliably solve some set of useful problems tomorrow. We don't know which problems they'll be able to reliably solve, or when. They're already doing some useful economic work. And AI agents will probably keep getting smarter over time. Thats all we know.

Maybe in a few years LLMs will be reliable enough to do what you're proposing. But neither I - nor most people in this thread - think they're there yet. If you think we're wrong, prove us wrong with code. Get ChatGPT - or whichever model you like - to actually do what you're suggesting. Nobody is stopping you.

CamperBob2 12/6/2025||
> Get ChatGPT - or whichever model you like - to actually do what you're suggesting. Nobody is stopping you.

I do, all the time.

> But if you take the best LLMs in the world, and ask them to make a working operating system, C compiler or web browser, they fail spectacularly.

Like almost any powerful tool, there are a few good ways to use LLM technology and countless bad ways. What kind of moron would expect "Write an operating system" or "Write a compiler" or "Write a web browser" to yield anything but plagiarized garbage? A high-quality program starts with a high-quality specification, same as always. Or at least with carefully-considered intent.

The difference is, given a sufficiently high-quality specification, an LLM can handle the specification->source step, just as a compiler or assembler relieves you of having to micromanage the source->object code step.

IMHO, the way it will shake out is that LLMs as we know them today will be only components, perhaps relatively small ones, of larger systems that translate human intent to machine function. What we call "programming" today is only one implementation of a larger abstraction.

AlexandrB 12/4/2025||||
I think this might be plausible in the future, but it needs a lot more tooling. For starters you need to be able to run the prompt through the exact same model so you can reproduce a "build".
josephg 12/4/2025|||
Even the exact same model isn't enough. There are several sources of nondeterminism in LLMs. These would all need to be squashed or seeded - which as far as I know isn't a feature that openai / anthropic / etc provide.
BurningFrog 12/4/2025|||
OK, then the current models aren't as good as I thought/hoped.

I guess one thing it means is that we still need extensive test suites. I suppose an LLM can write those too.

josephg 12/4/2025|||
Well.. except the AI models are nondeterministic. If you ask an AI the same prompt 20 times, you'll get 20 different answers. Some of them might work, some probably won't. It usually takes a human to tell which are which and fix problems & refactor. If you keep the prompt, you can't manually modify the generated code afterwards (since it'll be regenerated). Even if you get the AI to write all the code correctly, there's no guarantee it'll do the same thing next time.
panarky 12/3/2025||||
> It programs with the skill of the median programmer on GitHub

This is a common intuition but it's provably false.

The fact that LLMs are trained on a corpus does not mean their output represents the median skill level of the corpus.

Eighteen months ago GPT-4 was outperforming 85% of human participants in coding contests. And people who participate in coding contests are already well above the median skill level on Github.

And capability has gone way up in the last 18 months.

usefulcat 12/4/2025|||
The best argument I've yet heard against the effectiveness of AI tools for SW dev is the absence of an explosion of shovelware over the past 1-2 years.

https://mikelovesrobots.substack.com/p/wheres-the-shovelware...

Basically, if the tools are even half as good as some proponents claim, wouldn't you expect at least a significant increase in simple games on Steam or apps in app stores over that time frame? But we're not seeing that.

aydyn 12/4/2025|||
Are you sure we aren't seeing an increase in steam games?

Charts I'm looking at show a mild exponential around 2024 https://www.statista.com/statistics/552623/number-games-rele...

Also, there's probably a bottleneck in manual review time.

raw_anon_1111 12/4/2025||||
The shovelware is the companies getting funded…

https://docs.google.com/spreadsheets/d/1Uy2aWoeRZopMIaXXxY2E...

The shovelware software is coming…

Miraste 12/4/2025||||
Interesting approach. I can think of one more explanation the author didn't consider: what if software development time wasn't the bottleneck to what he analyzed? The chart for Google Play app submissions, for example, goes down because Google made it much more difficult to publish apps on their store in ways unrelated to software quality. In that case, it wouldn't matter whether AI tools could write a billion production-ready apps, because the limiting factor is Google's submission requirements.
parineum 12/4/2025||
There are other charts besides Google Play. Particularly insightful is the Steam chart, as Steam is already full of shovelware and, in my experience, many developers wish they were making games but the pay is bad.

The GitHub repos chart is pretty interesting too, but it could be that people just aren't committing this stuff. Showing zero increase is unexpected though.

falkensmaize 12/4/2025||||
I've had this same thought for some time. There should have been an explosion in startups, new products from established companies, new apps by the dozen every day. If LLMs can now reliably turn an idea into an application, where are they?
babyshake 12/4/2025|||
There is a deluge, every day. Just nobody notices or uses them.
DANmode 12/4/2025|||
Still figuring out if they’re adding value to 200 customers or not.
rrrx3 12/4/2025||||
The argument against this is that shovelware has a distinctly different distribution model now.

App stores have quality hurdles that didn’t exist in the diskette days. The types of people making low quality software now can self publish (and in fact do, often), but they get drowned out by established big dogs or the ever-shifting firehose of our social zeitgeist if you are not where they are.

Anyone who has been on Reddit this year in any software adjacent sub has seen hundreds (at minimum) of posts about “feedback on my app” or slop posts doing a god awful job of digging for market insights on pain points.

The core problem with this guy’s argument is that he’s looking in the wrong places - where a SWE would distribute their stuff, not a normie - and then drawing the wrong conclusions. And I am telling you, normies are out there, right now, upchucking some of the sloppiest of slop software you could ever imagine with wanton abandon.

zeckalpha 12/4/2025|||
Interesting, I would make the exact opposite conclusion from the same data: if AI coding was that bad, we'd see more crapware.
voidhorse 12/4/2025||||
Coding Contest != Software Engineering

Or even solving problems that businesses need to solve, generally speaking.

This complete misunderstanding of what software engineering even is is the major reason so many engineers are fed up with the clueless leaders foisting AI tools upon their orgs; those leaders apparently lack the critical reasoning skills to distinguish marketing speak from reality.

bpicolo 12/4/2025||||
Algorithmic coding contests are not an equivalent skillset to professional software development
sethherr 12/3/2025||||
Trying to figure out how to align this with my experiences (which match the parent comment), I have an idea:

Coding contests are not like my job at all.

My job is taking fuzzy human things and making code that solves it. Frankly AI isn’t good at closing open issues on open source projects either.

josephg 12/4/2025||||
I don't think this disproves my claim, for several reasons.

First, I don't know where those human participants came from, but if you pick people off the street or from a college campus, they aren't going to be the world's best programmers. On the other hand, GitHub users are on average more skilled than the average CS student. Even students and beginners who use GitHub usually don't have much code there. If the LLMs are weighted to treat every line of code about the same, they'd pick up more lines of code from prolific developers (who are often more experienced) than they would from beginners.

Also in a coding contest, you're under time pressure. Even when your code works, it's often ugly and thrown together. On GitHub, the only code I check in is code that solves whatever problem I set out to solve. I suspect everyone writes better code on GitHub than we do in programming competitions. I suspect if you gave the competitors functionally unlimited time to do the programming competition, many more would outperform GPT-4.

Programming contests also usually require that you write a fully self contained program which has been very well specified. The program usually doesn't need any error handling, or need to be maintained. (And if it does need error handling, the cases are all fully specified in the problem description). Relatively speaking, LLMs are pretty good at these kind of problems - where I want some throwaway code that'll work today and get deleted tomorrow.

But most software I write isn't like that. And LLMs struggle to write maintainable software in large projects. Most problems aren't so well specified. And for most code, you end up spending more effort maintaining the code over its lifetime than it takes to write in the first place. Chatgpt usually writes code that is a headache to maintain. It doesn't write or use local utility functions. It doesn't factor its code well. The code is often overly verbose. It often writes code that's very poorly optimized. Or the code contains quite obvious bugs for unexpected input - like overflow errors or boundary conditions. And the code it produces very rarely handles errors correctly. None of these problems really matter in programming competitions. But it does matter a lot more when writing real software. These problems make LLMs much less useful at work.

astrange 12/4/2025||||
Chess AI trained at specific human levels performs better than any humans at those levels, because the random mistakes get averaged out.

https://www.maiachess.com

incrudible 12/4/2025|||
> The fact that LLMs are trained on a corpus does not mean their output represents the median skill level of the corpus.

It does, by default. Try asking ChatGPT to implement quicksort in JavaScript, the result will be dogshit. Of course it can do better if you guide it, but that implies you recognize dogshit, or at least that you use some sort of prompting technique that will veer it off the beaten path.

keeda 12/4/2025||
I asked the free version of ChatGPT to implement quicksort in JS. I can't really see much wrong with it, but maybe I'm missing something? (Ugh, I just can't get HN to format code right... pastebin here: https://pastebin.com/tjaibW1x)

----

    function quickSortInPlace(arr, left = 0, right = arr.length - 1) {
      if (left < right) {
        const pivotIndex = partition(arr, left, right);
        quickSortInPlace(arr, left, pivotIndex - 1);
        quickSortInPlace(arr, pivotIndex + 1, right);
      }
      return arr;
    }

    function partition(arr, left, right) {
      const pivot = arr[right];
      let i = left;

      for (let j = left; j < right; j++) {
        if (arr[j] < pivot) {
          [arr[i], arr[j]] = [arr[j], arr[i]];
          i++;
        }
      }

      [arr[i], arr[right]] = [arr[right], arr[i]]; // Move pivot into place
      return i;
    }
josephg 12/4/2025||
This is exactly the level of code I've come to expect from chatgpt. It's about the level of code I'd want from a smart CS student. But I'd hope to never use this in production:

- It always uses the last item as a pivot, which will give it pathological O(n^2) performance if the list is sorted. Passing an already sorted list to a sort function is a very common case. Good quicksort implementations will use a random pivot, or at least the middle pivot so re-sorting lists is fast.

- If you pass already sorted data, the recursive call to quickSortInPlace will take up stack space proportional to the size of the array. So if you pass a large sorted array, not only will the function take n^2 time, it might also generate a stack overflow and crash.

- This code: ... = [arr[j], arr[i]]; Creates an array and immediately destructures it. This is - or at least used to be - quite slow. I'd avoid doing that in the body of quicksort's inner loop.

- There's no way to pass a custom comparator, which is essential in real code.

I just tried in firefox:

    // Sort an array of 1 million sorted elements
    arr = Array(1e6).fill(0).map((_, i) => i)
    console.time('x')
    quickSortInPlace(arr)
    console.timeEnd('x')
My computer ran for about a minute then the javascript virtual machine crashed:

    Uncaught InternalError: too much recursion
This is about the quality of quicksort implementation I'd expect to see in a CS class, or in a random package in npm. If someone on my team committed this, I'd tell them to go rewrite it properly. (Or just use a standard library function - which wouldn't have these flaws.)
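For what it's worth, the worst-case pivot problem alone is a small fix. A sketch only - it doesn't address the other issues above:

    // Same Lomuto partition, but swap a middle element into the pivot slot
    // first, so already-sorted input no longer degenerates into O(n^2) time
    // and O(n) recursion depth.
    function partition(arr, left, right) {
      const mid = left + Math.floor((right - left) / 2);
      [arr[mid], arr[right]] = [arr[right], arr[mid]];
      const pivot = arr[right];
      let i = left;
      for (let j = left; j < right; j++) {
        if (arr[j] < pivot) {
          [arr[i], arr[j]] = [arr[j], arr[i]];
          i++;
        }
      }
      [arr[i], arr[right]] = [arr[right], arr[i]];
      return i;
    }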
keeda 12/4/2025||
OK, you just added requirements the previous poster had not mentioned. Firstly, how often do you really need to sort a million elements in a browser anyway? I expect that sort of heavy lifting would usually be done on the server, where you'd also want to do things like paging.

Secondly, if a standard implementation was to be used, that's essentially a No-Op. AI will reuse library functions where possible by default and agents will even "npm install" them for you. This is purely the result of my prompt, which was simply "Can you write a QuickSort implementation in JS?"

In any case, to incorporate your feedback, I simply added "that needs to sort an array of a million elements and accepts a custom comparator?" to the initial prompt and reran in a new session, and this is what I got in less than 5 seconds. It runs in about 160ms on Chrome:

https://pastebin.com/y2jbtLs9

How long would your team-mate have taken? What else would you change? If you have further requirements, seriously, you can just add those to the prompt and try it for yourself for free. I'd honestly be very curious to see where it fails.

However, this exchange is very illustrative: I feel like a lot of the negativity is because people expect AI to read their minds and then hold it against it when it doesn't.

josephg 12/4/2025||
> OK, you just added requirements the previous poster had not mentioned.

Lol of course! The real requirements for a piece of software are never specified in full ahead of time. Figuring out the spec is half the job.

> Firstly, how often do you really need to sort a million elements in a browser anyway? I expect that sort of heavy lifting would usually be done on the server

Who said anything about the browser? I run javascript on the server all the time.

Don't defend these bugs. 1 million items just isn't very many items for a sort function. On my computer, the built in javascript sort function can sort 1 million sorted items in 9ms. I'd expect any competent quicksort implementation to be able to do something similar. Hanging for 1 minute then crashing is a bug.

If you want a use case, consider the very common case of sorting user-supplied data. If I can send a JSON payload to your server and make it hang for 1 minute then crash, you've got a problem.

> If you have further requirements, seriously, you can just add those to the prompt and try it for yourself for free. [..] How long would your team-mate have taken?

We've gotta compare like for like here. How long does it take to debug code like this when an AI generates it? It took me about 25 minutes to discover & verify those problems. That was careful work. Then you reprompted it, and then you tested the new code to see if it fixed the problems. How long did that take, all added together? We also haven't tested the new code for correctness or to see if it has new bugs. Given its a complete rewrite, there's a good chance chatgpt introduced new issues. I've also had plenty of instances where I've spotted a problem and chatgpt apologises then completely fails to fix the problem I've spotted. Especially lifetime issues in rust - its really bad at those!

The question is this: Is this back and forth process faster or slower than programming quicksort by hand? I'm really not sure. Once we've reviewed and tested this code, and fixed any other problems in it, we're probably looking at about an hour of work all up. I could probably implement quicksort at a similar quality in a similar amount of time. I find writing code is usually less stressful than reviewing code, because mistakes while programming are usually obvious. But mistakes while reviewing are invisible. Neither you nor anyone else in this thread spotted the pathological behavior this implementation had with sorted data. Finding problems like that by just looking is hard.

Quicksort is also the best case for LLMs. Its a well understood, well specified problem with a simple, well known solution. There isn't any existing code it needs to integrate with. But those aren't the sort of problems I want chatgpt's help solving. If I could just use a library, I'm already doing that. I want chatgpt to solve problems its probably never seen before, with all the context of the problem I'm trying to solve, to fit in with all the code we've already written. It often takes 5-10 minutes of typing and copy+pasting just to write a suitable prompt. And in those cases, the code chatgpt produces is often much, much worse.

> I feel like a lot of the negativity is because people expect AI to read their minds and then hold it against it when it doesn't.

Yes exactly! As a senior developer, my job is to solve the problem people actually have, not the problem they tell me about. So yes of course I want it to read my mind! Actually turning a clear spec into working software is the easy part. ChatGPT is increasingly good at doing the work of a junior developer. But as a senior dev / tech lead, I also need to figure out what problems we're even solving, and what the best approach is. ChatGPT doesn't help much when it comes to this kind of work.

(By the way, that is basically a perfect definition of the difference between a junior and senior developer. Junior devs are only responsible for taking a spec and turning it into working software. Senior devs are responsible for reading everyone's mind, and turning that into useful software.)

And don't get me wrong. I'm not anti chatgpt. I use it all the time, for all sorts of things. I'd love to use it more for production grade code in large codebases if I could. But bugs like this matter. I don't want to spend my time babysitting chatgpt. Programming is easy. By the time I have a clear specification in my head, its often easier to just type out the code myself.

keeda 12/4/2025||
> Figuring out the spec is half the job.

That's where we come in of course! Look into spec-driven development. You basically encourage the LLM to ask questions and hash out all these details.

> Who said anything about the browser?... Don't defend these bugs.

A problem of insufficient specification... didn't expect an HN comment to turn into an engineering exercise! :-) But these are the kinds of things you'd put in the spec.

> How long does it take to debug code like this when an AI generates it? It took me about 25 minutes to discover & verify those problems.

Here's where it gets interesting: before reviewing any code, I basically ask it to generate tests, which always all pass. Then I review the main code and test code, at which point I usually add even more test cases (e.g. https://news.ycombinator.com/item?id=46143454). And, because codegen is so cheap, I can even include performance tests (which, statistically speaking, nobody ever does)!
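(By a performance test I mean nothing fancier than the kind of thing below. The threshold and size are obviously illustrative and machine-dependent, and I'm assuming the comparator-taking quickSortInPlace from my earlier pastebin:)

    // Already-sorted input is the interesting case for a naive quicksort.
    const arr = Array.from({ length: 1e6 }, (_, i) => i);
    const t0 = performance.now();
    quickSortInPlace(arr, (a, b) => a - b);
    const t1 = performance.now();
    console.assert(t1 - t0 < 1000, `sort took ${(t1 - t0).toFixed(1)}ms`);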

Here's a one-shot result of that approach (I really don't mean to take up more of your time, this is just so you can see what it is capable of): https://pastebin.com/VFbW7AKi

While I do review the code (a habit -- I always review my own code before a PR), I review the tests more closely because, while boring, I find them a) much easier to review, and b) more confidence-inspiring than manual review of intricate logic.

> I want chatgpt to solve problems its probably never seen before, with all the context of the problem I'm trying to solve, to fit in with all the code we've already written.

Totally, and again, this is where we come in! Still, it is a huge productivity booster even in uncommon contexts. E.g. I'm using it to do computer vision stuff (where I have no prior background!) with opencv.js for a problem not well-represented in the literature. It still does amazingly well... with the right context. Its initial suggestions were overindexed on the common case, but over many conversations it "understood" my use case and now consistently gives appropriate suggestions. And because it's vision stuff, I can instantly verify results by sight.

Big caveat: success depends heavily on the use-case. I have had more mixed results in other cases, such as concurrency issues in an LD_PRELOAD library in C. One reason for the mixed sentiments we see.

> ChatGPT is increasingly good at doing the work of a junior developer.

Yes, and in fact, I've been rather pessimistic about the prospects of junior developers, a personal concern given I have a kid who wants to get into software engineering...

I'll end with a note that my workflow today is extremely different from before AI, and it took me many months of experimentation to figure out what worked for me. Most engineers simply haven't had the time to do so, which is another reason we see so many mixed sentiments. But I would strongly encourage everybody to invest the time and effort because this discipline is going to change drastically really quickly.

malfist 12/4/2025|||
By volume, the vast majority of code on GitHub is from students. Think about that when you average over GitHub for AI.
pseudalopex 12/4/2025||
> By volume, the vast majority of code on GitHub is from students.

Who determined this? How?

teaearlgraycold 12/3/2025||||
I saw an ad for Lovable. The very first thing I noticed was an exchange where the promoter asked the AI to fix a horizontal scroll bar that was present on his product listing page. This is a common issue with web development, especially so for beginners. The AI’s solution? Hide overflow on the X axis. Probably the most common incorrect solution used by new programmers.

But to the untrained eye the AI did everything correctly.

rhetocj23 12/3/2025||||
Yes. The people who are amazed by AI were never that good at a particular subject area in the first place - I don't care who you are. You were not good enough - how do I know this? Well, I know economics, corporate finance, accounting et al. very deeply. I've engaged with LLMs for years now and still they cannot get below the surface level, and they are not improving beyond this.

It's easy to recall information, but something entirely different to do something with that information. Which is what those subject areas are all about - taking something (like a theory) and applying it in a disciplined manner given the context.

That's not to diminish what LLMs can do. But let's get real.

dingnuts 12/3/2025|||
[dead]
scotty79 12/3/2025||||
It works both ways. If you are good, it's also easier to spot moments of brilliance from an AI agent, when it saves you hours of googling, reading docs, and trial and error while you pour yourself a cup of coffee and ponder the next steps. You can spot when a single tab press saved you minutes.
newAccount2025 12/4/2025|||
Yes. Love it for quick explorations of available options, reviewing my work, having it propose tests, getting its help with debugging, and all kinds of general subject matter questions. I don’t trust it to write anything important but it can help with a sketch.
myfavoritedog 12/4/2025|||
[dead]
RajT88 12/4/2025||||
I am not a great (some would argue, not even good) programmer, and I find a lot of issues with LLM generated code. Even Claude pro does really weird dumb stuff.
CyberDildonics 12/4/2025||||
It starts to make you realize how unaware many people must be of what their programs are doing to accept AI stuff wholesale.
codethief 12/4/2025||||
> Generally I've noticed the better someone is at coding, the more critical they are of AI-generated code.

I'm having a dejavu of yesterday's discussion: https://news.ycombinator.com/item?id=46126988

mountainriver 12/4/2025||||
IMO this is mostly just an ego thing. I often see staff+ engineers make up reasons why AI is bad, when really it’s just a prompting skill issue.

When something threatens a thing that gives you value, people tend to hate it

throwaway29812 12/3/2025|||
This seems to be the overall trend in AI. If you're an expert in something, you can see where it's wrong. If you're not, you can't.
gipp 12/3/2025||||
Engineers at Google are much less likely to be doing green-field generation of large amounts of code. It's much more incremental: carefully measured changes to mature, complex software stacks, done within the Google ecosystem, which diverges heavily from the OSS-focused world of startups, where most training data comes from.
karmasimida 12/3/2025|||
That is the problem.

AI is optimized to solve a problem no matter what it takes. It will try to solve one problem by creating 10 more.

I think long-horizon agentic AI is just snake oil at this point. AI works best if you can segment your task into 5-10 minute chunks, including the AI generation time, correction time, and engineer review time. To put it another way, a 10-minute sync with a human is necessary, otherwise it will go astray.

Then it just turns software engineering into a bothersome supervision job. Yes, I typed less, but I didn't feel the thrill of doing so.

citizenpaul 12/3/2025||
> it just turns software engineering into a bothersome supervision job.

I'm pretty sure this is the entire C-level enthusiasm for AI in a nutshell. Until AI, SWEs resisted being mashed into a replaceable-cog job that executives don't have to think or care about. AI is the magic beans that are just tantalizingly out of reach, and boy do they want it.

spwa4 12/3/2025|||
But every version of AI for almost a century has had this property, right down from the first vocoders that were going to replace entire call centers to the convolutional AI that was going to give us self-driving cars. Yes, a century: vocoders were 1930s technology, but they could essentially read the time aloud.

... except they didn't. In fact, most AI tech was good for a nice demo and little else.

In some cases, really unfairly. For instance, convnet map matching isn't used, not because it doesn't work well, but because you can't explain to humans when it won't work well. It's unpredictable, like a human. If you ask a human to map a building in heavy fog they may come back with "sorry". SLAM with lidar is "better", except no, it's a LOT worse. But when it fails it's very clear why it fails, because it's a very visual algorithm. People expect AIs to replace humans, but that doesn't work, because people also demand that AIs never say no and never fail, like the Star Trek computer (the only problem the Star Trek computer ever has is that it is misunderstood or follows policy too well). If you have a delivery person, occasionally they will radically modify the process, or refuse to deliver. No CEO is ever going to allow an AI drone to change the process, and no CEO will ever accept "no" from an AI drone. More generally, no business person seems to ever accept a 99% AI solution, and all AI solutions are 99%, or actually mostly less.

AI winters. I get the impression another one is coming, and I can feel it's going to be a cold one. But in 10 years, LLMs will be in a lot of stuff, like with every other AI winter. A lot of stuff ... but a lot less than CEOs are declaring it will be in today.

voidhorse 12/4/2025|||
Luckily for us, technologies like SQL made similar promises (for more limited domains) and C suites couldn't be bothered to learn that stuff either.

Ultimately they are mostly just clueless, so we will either end up with legions of way shittier companies than we have today (because we let them get away with offloading a bunch of work to tools they don't understand and accepting low-quality output) or we will eventually realize the continued importance of human expertise.

groby_b 12/3/2025||||
There are plenty of good tasks left, but they're often one-off/internal tooling.

Last one at work: "Hey, here are the symptoms for a bug, they appeared in <release XYZ> - go figure out the CL range and which 10 CLs I should inspect first to see if they're the cause"

(Well suited to AI, because worst case I've looked at 10 CLs in vain, and best case it saved me from manually scanning through several 1000 CLs - the EV is net positive)

It works for code generation as well, but not in a "just do my job" way, more in a "find which haystack the needle is in, and what the rough shape of the new needle is". Blind vibecoding is a non-starter. But... it's a non-starter for greenfields too, it's just that the FO of FAFO is a bit more delayed.

ethbr1 12/3/2025||
My internal mnemonic for targeting AI correctly is 'It's easier to change a problem into something AI is good at, than it is to change AI into something that fits every problem.'

But unfortunately the nuances in the former require understanding strengths and weaknesses of current AI systems, which is a conversation the industry doesn't want to have while it's still riding the froth of a hype cycle.

Aka 'any current weaknesses in AI systems are just temporary growing pains before an AGI future'

groby_b 12/3/2025||
> 'any current weaknesses in AI systems are just temporary growing pains before an AGI future'

I see we've met the same product people :)

ethbr1 12/3/2025||
I had a VP of a revenue cycle team tell me that his expectation was that they could fling their spreadsheets and Word docs on how to do calculations at an AI powered vendor, and AI would be able to (and I direct quote) "just figure it all out."

That's when I realized how far down the rabbit hole marketing to non-technical folks on this was.

almostdeadguy 12/3/2025||||
I think it's a fair point that Google has more stakeholders with a serious investment in some flubbed AI-generated code not tanking their share value, but I'm not sure the rest of it is all that different from what an engineer at $SOME_STARTUP does after the first ~8 months the company is around. Maybe some folks throwing shit at a wall to find PMF are really getting a lot out of this, but most of us are maintaining and augmenting something we don't want to break.
kccqzy 12/3/2025|||
Yeah but Google won’t expect you to use AI tools developed outside Google and trained on primarily OSS code. It would expect you to use the Google internal AI tools trained on google3, no?
xoogthrowkappa 12/4/2025||||
Excuse the throwaway. It's not even just the employees; it doesn't seem like the technical leadership seriously cares about internal AI use either. Before I left, all they pushed was code generation, but my work was 80% understanding 5-20 year old code and 20% actual development. If they put any noticeable effort into an LLM that could answer "show me all users of Proto.field that would be affected by X", my life would've been changed for the better, but I don't think the technical leadership understands this, or they don't want to spare the TPUs.

When I started at my post-Google job, I felt so vindicated when my new TL recommended that I use an LLM to catch up if no one was available to answer my questions.

3vidence 12/4/2025||||
Googler, opinion is my own.

Working on our mega huge code base with lots of custom tooling and bleeding-edge stuff hasn't been the best fit for AI-generated code compared to most companies.

I do think AI as a rubber-ducky / research-assistant type of tool has been helpful overall for me as a SWE.

nunez 12/4/2025||||
Makes sense to me.

From the outside, the AI push at Google very closely resembles the death march that Google+ was, but immensely more intense, with the entire tech ecosystem following suit.

Arainach 12/4/2025||||
Being forced to adopt tools regardless of fit to workflow (and being smart enough to understand the limitations of the tools despite management's claims) correlates very well to being negative on them.
fogj094j0923j4 12/4/2025||||
I notice that experts tend to be pretty bimodal. E.g. chefs either enjoy really well-made food or some version of the scrappy fast-food comfort they grew up eating.
petesergeant 12/4/2025||
Bimodal here suggests either/or which I don’t think is correct for either chefs or code enjoyers. I think experts tend to eschew snobbery more and can see the value in comfort food, quick and dirty AI prototypes or boilerplate, or say cheap and drinkable wine, while also being able to appreciate what the truly high-end looks like.

It’s the mid-range with pretensions that gets squeezed out. I absolutely do not need a $40 bottle of wine to accompany my takeout curry, I definitely don’t need truffle slices added to my carbonara, and I don’t need to hand-roll conceptually simple code.

anukin 12/4/2025||||
You cannot trust someone’s judgement on something if that something can result in them being unemployed.
llbeansandrice 12/4/2025||
Or if they stand to make a lot of money.

See both sides can be pithy.

volf_ 12/4/2025||||
because autocorrect and predictive text doesn't help when half your job is revisions
agumonkey 12/3/2025||||
So I would love to be a fly in their office and hear all their convos.
ta9000 12/4/2025||||
[dead]
nilkn 12/4/2025|||
People who've spent their life perfecting a craft are exactly the people you'd expect would be most negative about something genuinely disrupting that craft. There is significant precedent for this. It's happened repeatedly in history. Really smart, talented people routinely and in fact quite predictably resist technology that disrupts their craft, often even at great personal cost within their own lifetime.
queenkjuul 12/4/2025|||
I don't know that I consider recognizing the limitations of a tool to be resistance to the idea. It makes sense that experts would recognize those limitations most acutely -- my $30 Harbor Freight circular saw is a lifesaver for me when I'm doing slapdash work in my shed, but it'd be a critical liability for a professional carpenter needing precision cuts. That doesn't mean the professional carpenter is resistant to the idea of using power saws, just that they necessarily must be more discerning than I am.
rprend 12/4/2025|||
Yes, you get it. Obviously "writing code" will die. It will hold on in legacy systems that need bespoke maintenance, like COBOL systems do today. There will be artisanal coders, like there are artisanal blacksmiths, who do it the old-fashioned way, and we will smile and encourage them. Within 20 years, writing code syntax will be like writing assembly: something they make you do in school, something your dad brings up when reminiscing about the good old days.

I talked to someone who was in denial about this, until he said he had conflated writing code with solving problems. Solving problems isn’t going anywhere! Solving problems: you observe a problem, write out a solution, implement that solution, measure the problem again, consider your metrics, then iterate.

“Implement it” can mean writing code, as it has for the past 40 years, but it hasn't always meant that. Before coding, it was economics and physics majors who studied and implemented scientific management. For the next 20 years, it will be "describe the tool to Claude Code and use the result".

xwolfi 12/4/2025||
But Claude cannot code at all, it's gonna shit the bed and it learns only on human coders to be able to even know an example is a solution rather than a malware...
rprend 12/4/2025||
Every greenfield project uses claude code to write 90+% of code. Every YC startup for the past six months says AI writes 90+% of their code. Claude code writes 90+% of my code. That’s today.

It works great. I have a faster iteration cycle. For existing large codebases, AI modifications will continue to be okay-ish. But new companies with a faster iteration cycle will outcompete old ones, and so in the long run most codebases will use the same "in-distribution" tech stacks and architecture and design principles that AI is good at.

brazukadev 12/10/2025|||
That explains the low quality of all the Launch HNs of the past 6 months.
pseudalopex 12/4/2025|||
> Every greenfield project uses claude code to write 90+% of code.

Who determined this? How?

hectdev 12/3/2025|||
It's the latest tech holy war. Tabs vs. spaces, but more existential. I'm usually anti-hype, and I've been convinced of AI's usefulness over and over when it comes to coding. And whenever I talk about it, I see that I come across as an evangelist. Some people appreciate that; online I get a lot of pushback despite having tangible examples of how it has been useful.
suprjami 12/3/2025|||
I don't see it that way. Tabs, spaces, curly brace placement, Vim, Emacs, VSCode, etc are largely aesthetic choices with some marginal unproven cognitive implications.

I find people mostly prefer what they are used to, and if your preference was so superior then how could so many people build fantastic software using the method you don't like?

AI isn't like that. AI is a bunch of people telling me this product can do wonderful things that will change society and replace workers, yet almost every time I use it, it falls far short of that promise. AI is certainly not reliable enough for me to jeopardize the quality of my work by using it heavily.

dwaltrip 12/3/2025|||
You can vibe-code a throwaway UI for investigating some complex data in less than 30 minutes. The code quality doesn't matter, and it will make your life much easier.

Rinse and repeat for many "one-off" tasks.

It's not going away, you need to learn how to use it. shrugs shoulders

bigger_cheese 12/3/2025|||
The issue is people trying to use these AI tools to investigate complex data, not the throwaway UI part.

I work as the non-software kind of engineer at an industrial plant. A trend is starting to emerge of people who just blindly trust the output of AI chat sessions without understanding what the chatbot is echoing at them, which is wasteful of their time and, in some cases, my time.

This is not new; in the past I have seen engineers who use (abuse) statistics/regression tools etc. without understanding what the output was telling them, but it is getting worse now.

It is not uncommon to hear something like: "Oh I investigated that problem and this particular issue we experienced was because of reasons x, y and z."

Then when you push back because what they've said sounds highly unlikely, it boils down to: "I don't know, that is what the AI told me."

Then if they are sufficiently optimistic they'll go back and prompt it with "please supply evidence for your conclusion" or some similar prompt, and it will supply paragraphs of plausible-sounding text, but when you dig into what it is saying there are inconsistencies or made-up citations. I've seen it say things that were straight up incorrect and went against the laws of thermodynamics, for example.

It has become the new "I threw the kitchen sink into a multivariate regression and X emerged as significant - therefore we should address x"
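
To make the kitchen-sink point concrete, here is a minimal sketch with purely made-up data (plain numpy, not anything from the plant described above): correlate 20 pure-noise predictors with a pure-noise outcome and, on average, about one of them will still clear the usual 5% significance bar purely by chance, which is the same trap as the kitchen-sink regression.

  import numpy as np

  rng = np.random.default_rng(0)
  n, k = 100, 20
  X = rng.normal(size=(n, k))     # 20 candidate "causes", all pure noise
  y = rng.normal(size=n)          # the outcome, also pure noise

  # Pearson correlation of each predictor with the outcome.
  r = np.array([np.corrcoef(X[:, j], y)[0, 1] for j in range(k)])

  # Approximate two-sided 5% cutoff for the correlation of independent normals.
  threshold = 1.96 / np.sqrt(n)
  print((np.abs(r) > threshold).sum(), "of", k, "noise predictors look 'significant'")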

I'm not a complete skeptic; I think AI has some value. For example, if you use it as a more powerful search engine by asking it something like "What are some suggested techniques for investigating x" or "What are the limitations of Method Y", etc., it can point you to the right place and assist you with research; it might find papers from other fields or similar. But it is not something you should be relying on to do all of the research for you.

BearOso 12/4/2025||||
But how do you know you're getting the correct picture from that throwaway UI? A little while back there was a blog post where the author praised AI for his vibe-coded earth-viewer app that supposedly used Vulkan to render inside a GUI window. Unfortunately, that wasn't the case; the AI just copied code from somewhere and inserted a rudimentary software renderer. The AI couldn't do what was asked because it had seldom been done. Nobody on the internet had ever discussed that particular objective, so it wasn't in the training set.

The lesson to learn is that these are "large-language models." That means it can regurgitate what someone else has done before textually, but not actually create something novel. So it's fine if someone on the internet has posted or talked about a quick UI in whatever particular toolkit you're using to analyze data. But it'll throw out BS if you ask for something brand new. I suspect a lot of AI users are web developers who write a lot of repetitive rote boilerplate, and that's the kind of thing these LLMs really thrive with.

keeda 12/4/2025||
> But how do you know you're getting the correct picture from that throwaway UI?

You get the AI to generate code that lets you spot-check individual data points :-)

Most of my work these days is in fact that kind of code. I'm working on something research-y that requires a lot of visualization, and at this point I've actually produced more throwaway code than code in the project.

Here's an example: I had ChatGPT generate some relatively straightforward but cumbersome geometric code. Saved me 30 - 60 minutes right there, but to be sure, I had it generate tests, which all passed. Another 30 minutes saved.

I reviewed the code and the tests and felt it needed more edge cases, which I added manually. However, these started failing and it was really cumbersome to make sense of a bunch of coordinates in arrays.

So I had it generate code to visualize my test cases! That instantly showed me that some assertions in my manually added edge cases were incorrect, which became a quick fix.
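
Concretely, the "generate code to visualize the test cases" step might look something like this sketch, with a hypothetical point_in_polygon standing in for the actual geometric code (matplotlib assumed); a wrong expected value shows up as a red X at a glance:

  import matplotlib.pyplot as plt

  def point_in_polygon(poly, pt):
      # Stand-in for the geometric routine under test: standard ray-casting check.
      x, y = pt
      inside = False
      n = len(poly)
      for i in range(n):
          x1, y1 = poly[i]
          x2, y2 = poly[(i + 1) % n]
          if (y1 > y) != (y2 > y):
              if x < (x2 - x1) * (y - y1) / (y2 - y1) + x1:
                  inside = not inside
      return inside

  # Hypothetical test cases: (polygon, point, expected_inside).
  cases = [
      ([(0, 0), (4, 0), (4, 4), (0, 4)], (2, 2), True),
      ([(0, 0), (4, 0), (4, 4), (0, 4)], (5, 2), False),
      ([(0, 0), (4, 0), (2, 3)], (2, 1), True),
  ]

  fig, axes = plt.subplots(1, len(cases), figsize=(3 * len(cases), 3))
  for ax, (poly, pt, expected) in zip(axes, cases):
      xs, ys = zip(*(poly + [poly[0]]))      # close the polygon outline
      ax.plot(xs, ys, "k-")
      got = point_in_polygon(poly, pt)
      # Green dot if the function agrees with the assertion, red X if not.
      ax.plot(*pt, "go" if got == expected else "rx", markersize=10)
      ax.set_title(f"expected={expected}, got={got}")
  plt.tight_layout()
  plt.show()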

The answer to "how do you trust AI" is human in the loop... AND MOAR AI!!! ;-)

dwaltrip 12/3/2025||||
It’s kind of fun watching this comment go up and down :)

There’s so much evidence out there of people getting real value from the tools.

Some questions you can ask yourself are “why doesn’t it work for me?” and “what can I do differently?”.

Be curious, not dogmatic. Ignore the hype, find people doing real work.

SpicyLemonZest 12/3/2025|||
They're good questions! The problem is that I've tried to talk to the people who are getting real value from it, and often the answer ends up being that the value is not as real as they think. One guy gave an excited presentation about how AI let him write 7k LOC per day, expounded for an entire session about how the rest of us should follow in his footsteps, and then clarified only in Q&A that reviewers couldn't keep up, so he exempted himself from code review.
adriand 12/3/2025|||
I’m starting to believe there are situations where the human code review is genuinely not necessary. Here’s a concrete example of something that’s been blowing my mind. I have 25 years of professional coding experience but it’s almost all web, with a few years of iOS in the objective C era. I’m also an amateur electronic musician. A couple of weeks ago I was thinking about this plugin that I used to love until the company that made it went under. I’ve long considered trying to make a replacement but I don’t know the first thing about DSP or C++.

You know where this is going. I asked Claude if audio plugins were well represented in its training data, it said yes, off I went. I can’t review the code because I lack the expertise. It’s all C++ with a lot of math and the only math I’ve needed since college is addition and calculating percentages. However, I can have intelligent discussions about design and architecture and music UX. That’s been enough to get me a functional plugin that already does more in some respects than the original. I am (we are?) making it steadily more performant. It has only crashed twice and each time I just pasted the dump into Claude and it fixed the root cause.

Long story short: if you can verify the outcome, do you need to review the code? It helps that no one dies or gets underpaid if my audio plugin crashes. But still, you can’t tell me this isn’t remarkable. I think it’s clear there will be a massive proliferation of niche software.

deltaburnt 12/4/2025|||
I don’t think I’ve ever seen someone seriously argue that personal throwaway projects need thorough code reviews of their vibe code. The problem comes in when I’m maintaining a 20 year old code base used by anywhere from 1M to 1B users.

In other words you can’t vibe code in an environment where evaluating “does this code work” is an existential question. This is the case where 7k LOC/day becomes terrifying.

Until we get much better at automatically proving correctness of programs we will need review.

adriand 12/4/2025|||
My point about my experience with this plugin isn’t that it’s a throwaway or meaningless project. My point is that it might be enough in some cases to verify output without verifying code. Another example: I had to import tens of thousands of records of relational data. I got AI to write the code for the import. All I verified was that the data was imported correctly. I didn’t even look at the code.
deltaburnt 12/4/2025||
In this context I meant throwaway as "low stakes", not "meaningless". Again, evaluating the output of a database import like that could be existential for your company given the context. Not to mention there are many cases where evaluating the output isn't feasible for a human.
nilkn 12/4/2025|||
Human code review does not prove correctness. Almost every software service out there contains bugs. Humans have struggled for decades to reliably produce correct software at scale and speed. Overall, humans have a pretty terrible track record of producing bug-free correct code no matter how much they double-check and review their code along the way.
cyphar 12/4/2025||
So the solution is to stop doing code reviews and just YOLO-merge everything? After all, everything is fucked already, how much worse could it get?

For the record, there are examples where human code review and design guidelines can lead to very low-bug code. NASA published their internal guidelines for producing safety-critical code[1]. The problem is that the development cost of software when using such processes is too high for most companies, and most companies don't actually produce safety-critical software.

My experience with the vast majority of LLM code submitted to projects I maintain is that it has subtle bugs that I managed to find through fairly cursory human review. The copilot code review feature on GitHub also tends to miss actual bugs and report nonexistent bugs, making it worse than useless. So in my view, the death of the benefits of human code review has been wildly exaggerated.

[1]: https://en.wikipedia.org/wiki/The_Power_of_10:_Rules_for_Dev...

nilkn 12/4/2025||
No, that's not what I wrote, and it's not the correct conclusion. What I wrote (and what you, in fact, also wrote) is that in reality we generally do not actually need provably correct software except in rare cases (e.g., safety-critical applications). Suggesting that human review cannot be reduced or phased out at all until we can automatically prove correctness is wrong, because fully 100% correct and bug-free software is not needed for the vast majority of code being produced. That does not mean we immediately throw out all human review, but the bar for making changes for how we review code is certainly much lower than the above poster suggested.
deltaburnt 12/4/2025||
I don't really buy your premise. What you're suggesting is that all code has bugs, and those bugs have equal severity and distribution regardless of any forethought or rigor put into the code.

You're right, human review and thorough design are a poor approximation of proving assumptions about your code. Yes bugs still exist. No you won't be able to prove the correctness of your code.

However, I can pretty confidently assume that malloc will work when I call it. I can pretty confidently assume that my thoroughly tested linked list will work when I call it. I can pretty confidently assume that following RAII will avoid most memory leaks.

Not all software needs meticulous careful human review. But I believe that the compounding cost of abstractions being lost and invariants being given up can be massive. I don't see any other way to attempt to maintain those other than human review or proven correctness.

nilkn 12/4/2025||
I did suggest all code has bugs (up to some limit -- while I wasn't careful to specify this, as discussed above, there does exist an extraordinary level of caution and review that if used can approximate perfect bug-free code, as in your malloc example and in the example of NASA, but that standard is not currently applied to 99.9% of human-generated and human-reviewed code, and it doesn't need to be). I did not suggest anything else you said I suggested, so I'm not sure why you made those parts up.

"Not all software needs meticulous careful human review" is exactly the point. The question of exactly what software needs that kind of review is one whose answer I expect to change over the next 5-10 years. We are already at the point where it's so easy to produce small but highly non-trivial one-off applications that one needn't examine the code at all -- I completely agree with the above poster that we're rapidly discovering new examples of software development where output-verification is all you need, just like right now you don't hand-inspect the machine code generated by your compiler. The question is how far that will be able to go, and I don't think anybody really knows right now, except that we are not yet at the threshold. You keep bringing up examples where the stakes are "existential", but you're underestimating how much software development does not have anything close to existential stakes.

SpicyLemonZest 12/4/2025|||
I agree that's remarkable, and I do expect a proliferation of LLM-assisted development in similar niches where verification is easy and correctness isn't critical. But I don't think most software developers today are in such niches.
stocksinsmocks 12/3/2025||||
Most enterprise software I use has serious defects. Professional CAD software for infrastructure is awful. Many products are just incremental improvements piled upon software from the 1990s. Bugs last for decades because nobody can understand how the program works, so they just work on one more little VBA plugin at a time. Meanwhile, the capabilities of these programs have fallen completely behind game studios with no budget and no business plan. Where are the results of this human excellence and code quality process? There are tens of thousands of new CVEs every year from code hand-crafted by artisans on their very own MacBooks. How? Perhaps code quality is mostly an aesthetic judgment that nobody can really define, and maybe this effort is mostly spent on vague concepts like maintainability or preferential decisions instead of the basics: does it meet the specification? Is the performance getting better or worse?

This is the game changer for me: I don’t have to evaluate tens or hundreds of market options that fit my problem. I tell the machine to solve it, and if it works, then I’m happy. If it doesn’t, I throw it away. All in a few minutes and for a few cents. Code is going the way of the disposable diaper, and, if you have ever washed a cloth diaper, you will know that’s a good thing.

SpicyLemonZest 12/4/2025||
> I tell the machine to solve it, and if it works, then I’m happy. If it doesn’t I throw it away.

What happens when it seems to work, and you walk away happy, but discover three months later that your circular components don't line up because the LLM-written CAD software used an over-rounded PI = 3.14? I don't work in industrial design, but I faced a somewhat similar issue where an LLM-written component looked fine to everyone until final integration forced us to rewrite it almost entirely.
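
To put a rough number on the over-rounded PI failure mode (hypothetical dimensions, only to show the scale of the error): laying out 12 bolt holes on a 500 mm radius circle with PI = 3.14 leaves the pattern about 1.6 mm of arc short of a true full turn, easily enough to keep it from mating with a correctly machined part.

  import math

  radius_mm = 500.0   # hypothetical bolt-circle radius
  holes = 12          # hypothetical hole count
  for pi in (3.14, math.pi):
      step = 2 * pi / holes                    # angular spacing used for the layout
      shortfall = 2 * math.pi - holes * step   # how far the pattern misses a true full turn
      print(f"PI={pi:.5f}: pattern falls short by {shortfall * radius_mm:.2f} mm of arc")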

queenkjuul 12/4/2025||
This is basically me at my job right now. My boss used Claude Code in his spare time to write a "proof of concept" Electron app. It mostly worked but had some weird edge case behaviors. Now it's handed off to me, and fixing those edge cases is requiring me to refactor basically every single thing Claude touched. Vast majority I'm just tossing and redoing from scratch.

The original code "looks" fine, and it works pretty well even, but an LLM cannot avoid critical oversights along the way, and is fundamentally designed to its mistakes look as plausibly correct as possible. This makes correcting the problems down the line much more annoying (unless you can afford to live with the bugs and keep slapping on more band aids, i guess)

samdoesnothing 12/3/2025||||
Most people don't have a problem with using GenAI for stuff like throwaway UIs. That's not even remotely relevant to the criticisms. People reject having it forced down their throats by companies who are desperate to make us totally reliant on it to justify their insane investments. And people reject the evangelists who claim that it's going to replace developers because it can spit out mostly working boilerplate.
mattgreenrocks 12/3/2025||||
I’m an AI skeptic. I like seeing what UIs it spits out, though, which nicely defeats the fear of the blank page staring into my soul. I don’t even use the code, just take inspiration from the layouts.
scotty79 12/3/2025||
Yeah, it helps a lot with taking the first steps, overcoming writer's block, and making you put into words what you'd like to have built.

At some point you might take over and ask it for the specific refactors you'd do yourself but are too lazy to. Or even toss it all away and start fresh with a better understanding, either yourself or again with an agent.

pydry 12/4/2025|||
It's like watching somebody argue that code linting is going to change the face of the world, while the rebuttals to the skeptics argue that akshually code linting is quite useful...
icedchai 12/3/2025||||
I have found value for one-off tasks. I forget the exact situation, but I wanted to do some data transformation, something that would normally take me a half hour of awk/sed/bash or Python scripting. AI spit it out right away.
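
Purely as an illustration (the comment doesn't say what the real transformation was), the kind of one-off script meant here is usually something on this scale, e.g. collapsing a hypothetical CSV of (timestamp, host, bytes) rows into total bytes per host:

  import csv
  from collections import defaultdict

  totals = defaultdict(int)
  with open("transfers.csv", newline="") as f:   # hypothetical input file
      for row in csv.DictReader(f):
          totals[row["host"]] += int(row["bytes"])

  # Largest consumers first, tab-separated for easy piping into other tools.
  for host, total in sorted(totals.items(), key=lambda kv: -kv[1]):
      print(f"{host}\t{total}")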
thewebguyd 12/3/2025||||
> You can vibe-code a throwaway UI for investigating some complex data in less than 30 minutes. The code quality doesn't matter, and it will make your life much easier.

I think the throwaway part is important here and people are missing it, particularly for non-programmers.

There are a lot of roles in the business world that would make great use of ephemeral little apps like this: do a specific task, then throw the app away. Usually just running locally on someone's machine, or at most shared with a couple other folks in your department.

Code doesn't have to be good, hell it doesn't even have to be secure, and certainly doesn't need to look pretty. It just needs to work.

There's not enough engineering staff or time to turn every manager's pet excel sheet project into a temporary app, so LLMs make perfect sense here.

I'd go as far as to say more effort should be put into ephemeral apps as a use case for LLMs, over focusing on trying to use them in areas where a more permanent, high-quality solution is needed.

Improve them for non-developers.

Arainach 12/4/2025||||
>You can vibe-code a throwaway UI

And then people create non-throwaway things with it and your job, performance report, bonus, and healthcare are tied to being compared to those people who just do what management says without arguing about the correct application of the tool.

If you keep your job, it's now tied to maintaining the garbage those coworkers checked in.

xorcist 12/3/2025||||
Perhaps. But does it matter? There are a million tools to investigate complex data already. Are you suggesting it is more useful to develop a new tool from scratch, using LLM-type tools, than it is to use a mature tool for data analysis?

If you don't know how to analyze data, and flat out refuse to invest in learning the skill, then I guess that could be really useful. Those users are likely the ones most enthusiastic about AI. But are those users close to as productive as someone who learns a mature tool? Not even close.

Lots of people appreciate an LLM generating boilerplate code and establishing frameworks for their data structures. But that's code that probably shouldn't be there in the first place. Vibe coding a game can be done impressively quickly, but have you tried using a game construction kit? That's much faster still.

gigel82 12/4/2025||||
Except when your AI psychosis PM / manager sees your throwaway vibe-coded garbage and demands it gets shipped to customers.

It's infinitely worse when your PM / manager vibe-codes some disgusting garbage, sees that it kind of looks like a real thing that solves about half of the requirements (badly) and demands engineers ship that and "fix the few remaining bugs later".

llbeansandrice 12/4/2025||||
You should try telling management it’s throwaway
area51org 12/4/2025||||
One thing people often don't realize or ignore: these LLMs are trained on the internet, the entire internet.

There's a shit-ton of bad and inefficient code on the internet. Lots of it. And it was used to train these LLMs as much as the good code.

In other words, the LLMs are great if you're OK with mediocrity at best. Mediocrity is occasionally good enough, but it can spell death for a company when key parts of it are mediocre.

I'm afraid a lot of the executives who fantasize about replacing humans with AI are going to have to learn this the hard way.

hectdev 12/3/2025||||
I would say it is like that. No one HAS to use AI. But the shared goal is to get a change to the codebase to achieve a desired outcome. Some will outsource a significant part of that to AI, some won't.

And it's tricky, because I'm trying not to appeal to emotion despite being fascinated with how this tool has enabled me to do things in a short amount of time that would have taken me weeks of grinding to get to, and how it improves my communication with stakeholders. That feels world-changing. Specifically my world, and the day-to-day role I play when it comes to getting things done.

I think it is fine that it fell short of your expectations. It often does for me as well, but when it gets me 80% of the way there in less than a day's work, my mind is blown. It's an imperfect tool and, I'm sorry for saying this, but so are we. Treat its imperfections the same way you would a junior developer's: feedback, reframing, restrictions, and iteration.

Freak_NL 12/3/2025|||
> No one HAS to use AI.

Well… That's no longer true, is it?

My partner (IT analyst) works for a company owned by a multinational big corporation, and she got told during a meeting with her manager that use of AI is going to become mandatory next year. That's going to be a thing across the board.

And have you called a large company for any reason lately? Could be your telco provider, your bank, public transport company, whatever. You call them because online contact means haggling with an AI chatbot first, finally giving up, and being shunted over to an actual person who can help, and contact forms and e-mail have been killed off. Calling is not quite as bad, but step one nowadays is 'please describe what you're calling for', where some LLM will try to parse that, fail miserably, and then shunt you to an actual person.

AI is already unavoidable.

palmotea 12/3/2025|||
> My partner (IT analyst) works for a company owned by a multinational big corporation, and she got told during a meeting with her manager that use of AI is going to become mandatory next year. That's going to be a thing across the board.

My multinational big corporation employer has reporting about how much each employee uses AI, with a naughty list of employees who aren't meeting their quota of AI usage.

ryandrake 12/3/2025|||
Nothing says "this product is useful" quite like forcing people to use it and punishing people who don't. If it was that good, there'd be organic demand to use it. People would be begging to use it, going around their boss's back to use it.

The fact that companies have to force you to use it with quotas and threats is damning.

anon7725 12/4/2025||||
> My multinational big corporation employer has reporting about how much each employee uses AI, with a naughty list of employees who aren't meeting their quota of AI usage.

“Why don’t you just make the minimum 37 pieces of flAIr?”

groby_b 12/3/2025||||
Yeah. Well. There are companies that require TPS reports, too.

It's mostly a sign leadership has lost reasoning capability if it's mandatory.

But no, reporting isn't necessarily the problem. There are plenty of places that use reporting to drive a conversation on what's broken, and why it's broken for their workflow, and then use that to drive improvement.

It's only a problem if the leadership stance is "Haha! We found underpants gnome step 2! Make underpants number go up, and we are geniuses". Sadly not as rare as one would hope, but still stupid.

int_19h 12/4/2025|||
Those kinds of reports seem to be a thing at all big tech corps now.
oarsinsync 12/3/2025||||
> And have you called a large company for any reason lately? Could be your telco provider, your bank, public transport company, whatever. You call them, because online contact means haggling with an AI chatbot first to finally give up and shunt you over to an actual person who can help, and contact forms and e-mail have been killed off. Calling is not exactly as bad, but step one nowadays is 'please describe what you're calling for', where some LLM will try to parse that, fail miserably, and then shunt you to an actual person

All of this predates LLMs (what “AI” means today) becoming a useful product. All of this happened already with previous generations of “AI”.

It was just even shittier than the version we have today.

pxc 12/4/2025||
It was also shittier than the version we had before it (human receptionists).

This is what I always think of when I imagine how AI will change the world and daily life. Automation doesn't have to be better (for the customer, for the person using it, for society) in order to push out the alternatives. If the automation is cheap enough, it can be worse for everyone and still change everything. Those are the niches I'm most certain are here to stay, because sometimes it hardly matters if it's any good.

kentm 12/3/2025||||
> where some LLM will try to parse that, fail miserably, and then shunt you to an actual person.

If you're lucky. I've had LLMs that just repeatedly hang up on me when they obviously hit a dead end.

hectdev 12/3/2025||||
It isn't a universal thing. I have no doubt there are jobs out there where that isn't a requirement. I think the issue is that C-level folks are seeing how much more productive someone might be and making it a demand. That to me is the wrong approach. If you demonstrate and build interest, the adoption will happen.
groby_b 12/3/2025|||
As opposed to reaching, say, somebody in an offshored call center with an utterly undecipherable accent reading a script at you? Without any room for deviation?

AI's not exactly a step down from that.

moduspol 12/3/2025|||
> But the shared goal is to get a change to the codebase to achieve a desired outcome.

I'd argue that's not true. It's more of a stated goal. The actual goal is to achieve the desired outcome in a way that has manageable, understood side effects, and that can be maintained and built upon over time by all capable team members.

The difference between what business folks see as the "output" of software developers (code) and what (good) software developers actually deliver over time is significant. AI can definitely do the former. The latter is less clear. This is one of the fundamental disconnects in discussions about AI in software development.

hectdev 12/3/2025||
In my personal use case, I work at a company that has SO MUCH process and documentation for coding standards. I made an AI agent that knows all that and used it to update legacy code to the new standard in a day. Something that would have taken weeks if not more. If your desire is manageable code, make that a requirement.

I'm going to say this next thing as someone with a lot of negative bias about corporations. I was laid off from Twitter when Elon bought the company and at a second company that was hemorrhaging users.

Our job isn't to write code, it's to make the machine do the thing. All the effort for clean, manageable code, etc. is purely in the interest of the programmer, but at the end of the day, launching the feature that pulls in money is the point.

SpicyLemonZest 12/3/2025|||
How did you verify that your AI agent performed the update correctly? I've experienced a number of cases where an AI agent made a change that seemed right at first glance, maybe even passed code review, but fell apart completely when it came time to build on top of it.
spopejoy 12/4/2025|||
> made a change that seemed right at first glance, maybe even passed code review, but fell apart completely when it came time to build on top of it

Maybe I'm not understanding your point, but this is the kind of thing that happens in software teams all the time and is one of those "that's why they call it work" realities of the job.

If something "seems right/passed review/fell apart" then that's the reviewer's fault right? Which happens, all the time! Reviewers tend to fall back to tropes and "is there tests ok great" and whatever their hobbyhorses tend to be, ignoring others. It's ok because "at least it's getting reviewed" and the sausage gets made.

If AI slashed the amount of time to get a solution past review, it buys you time to retroactively fix too, and a good attitude when you tell it that PR 1234 is why we're in this mess.

prewett 12/4/2025||
> If something "seems right/passed review/fell apart" then that's the reviewer's fault right?

No, it's the author's fault. The point of a code review is not to ensure correctness, it is to improve code quality (correctness, maintainability, style consistency, reuse of existing functions, knowledge transfer, etc).

spopejoy 12/4/2025||
I mean, that's just not true when you're talking about varying levels of experience. Review is _very_ important with juniors, obviously. If you as a senior engineer let a junior put code in the codebase that messes things up later, you share that blame for sure.
hectdev 12/3/2025|||
Unit tests, manual testing the final product, PR with two approvals needed (and one was from the most anal retentive reviewer at the company who is heavily invested in the changes I made), and QA.
moduspol 12/4/2025|||
It's not just about coding standards. It's about, over time, having a team of people with a built-up set of knowledge about how things work and how they're expected to work. You don't get that by vibe coding and reviewing numerous PRs written by other people (or chatbots).

If everyone on your team is doing that, it's not long before huge chunks of your codebase are conceptually like stuff that was written a long time ago by people who left the company. Except those people may have actually known what they were doing. The AI chatbots are generating stuff that seems to plausibly work well enough based on however they were prompted.

There are intangible parts of software development that are difficult to measure but incredibly valuable beyond the code itself.

> Our job isn't to write code, it's to make the machine do the thing. All the effort for clean, manageable, etc is purely in the interest of the programmer but at the end of the day, launching the feature that pulls in money is the point.

This could be the vibe coder mantra. And it's true on day one. Once you've got reasonably complex software being maintained by one or more teams of developers who all need to be able to fix bugs and add features without breaking things, it's not quite as simple as "make the machine do the thing."

Loughla 12/3/2025|||
>AI is certainly not reliable enough for me to jeopardize the quality of my work by using it heavily.

I mean this in sincerity, and not at all snarky, but - have you considered that you haven't used the tools correctly or effectively? I find that I can get what I need from chatbots (and refuse to call them AI until we have general AI just to be contrary) if I spend a couple of minutes considering constraints and being careful with my prompt language.

When I've come across people in my real life who say they get no value from chatbots, it's because they're asking poorly formed questions, or haven't thought through the problem entirely. Working with chatbots is like working with a very bright lab puppy. They're willing to do whatever you want, but they'll definitely piss on the floor unless you tell them not to.

Or am I entirely off base with your experience?

dwoldrich 12/3/2025|||
It would be helpful if you would relate your own bad experiences and how you overcame them. Leading off with "do it better" isn't very instructive. Unfortunately there's no useful training for much of anything in our industry, much less AI.

I prefer to use LLM as a sock puppet to filter out implausible options in my problem space and to help me recall how to do boilerplate things. Like you, I think, I also tend to write multi-paragraph prompts repeating myself and calling back on every aspect to continuously hone in on the true subject I am interested in.

I don't trust LLM's enough to operate on my behalf agentically yet. And, LLM is uncreative and hallucinatory as heck whenever it strays into novel territory, which makes it a dangerous tool.

kentm 12/3/2025||||
> have you considered that you haven't used the tools correctly or effectively?

The problem is that this comes off just as tone-deaf as "you're holding it wrong." In my experience, when people promote AI, it's sold as just having a regular conversation and then the AI does the thing. And when that doesn't work, the promoter goes into system prompts, MCP, agent files, etc., and entire workflows that are required to get it to do the correct thing. It ends up feeling like you're being lied to, even if there's some benefit out there.

There's also the fact that all programming workflows are not the same. I've found some areas where AI works well, but a lot of my work it does not. Usually things that wouldn't show up in a simple Google search back before it was enshittified are pretty spotty.

mattgreenrocks 12/3/2025|||
I suspect AI appeals very strongly to a certain personality type who revels in all the details in getting a proper agentic coding environment bootstrapped for AI to run amok in, and then supervises/guides the results.

Then there are people like me, whom you’d probably term an old soul, who look at all that and say, “I have to change my workflow, my environment, and babysit it? It is faster to simply do the work.” My relationship with tech is that I like using as little as possible, and what I use needs to be predictable and do something for me. AI doesn’t always work for me.

msikora 12/4/2025||
Yes, this rings true. It took me over a month to actually get to at least 1x of my usual productivity with Claude Code. There is a ton of setup and a ton of things to learn and try to see what works: what to watch out for and how to babysit it so it doesn't go off the rails (a quite heavy-handed approach works best for me). It's kind of like a shitty, but very fast and very knowledgeable, junior developer. At this moment it still maybe isn't "worth it" for a lot of devs if productivity (and developer ergonomics) is the only goal, but it is clear to me that this is where the industry is heading, and I think every dev will eventually have to get on board. These tools really just started to be somewhat decent this year. I'm 100% sure that in a year or two it will be the default for everyone, in a way that you simply won't be able to compete without it. It would be like using a shovel instead of an excavator. Remember, right now is the worst it'll ever be.
Lerc 12/3/2025|||
> In my experience, when people promote AI, its sold as just having a regular conversation and then the AI does thing.

This is almost the complete opposite of my experience. I hear expressions about improvements and optimism for the future, but almost all of the discussion from people actively and productively using AI is about identifying the limits and seeing what benefits you can find within those limits.

They are not useless and they are also not a panacea. It feels like a lot of people consider those the only available options.

jandrewrogers 12/3/2025||||
AI is okay (not great) at generating low- to mid-skill code. If you are working in a high-skill software domain that requires pervasive state-of-the-art or first-principles implementation then AI produces consistently terrible code. It frequently is flatly incorrect about esoteric technical details that really matter.

It can't reason from first principles and there isn't training data for a lot of state-of-the-art computer science and code implementations. Nothing you can prompt will make it produce non-naive output because it doesn't have that capability.

AI works for a lot of things because, if we are honest, AI generated slop is replacing human generated slop. But not all software is slop and there are software domains where slop is not even an option.

suprjami 12/5/2025|||
All good, no snark inferred. Yes I have considered this, and I keep considering it every time I get a bad result. Sorry this response is so long.

I think I have a good idea how these things work. I have run local LLMs for a couple of years on a pair of video cards here, trying out many open weight models. I have watched the 3blue1brown ML course. I have done several LinkedIn Learning courses (which weren't that helpful, just mandatory). I understand about prompting precisely and personas (though I am not sold personas are a good idea). I understand LLMs do not "know" anything, they just generate the next most likely token. I understand LLMs are not a database with accurate retrieval. I understand "reasoning" is not actual thinking just manipulating tokens to steer a conversation in vector space. I understand LLMs are better for some tasks (summarisation, sentiment analysis, etc) than others (retrieval, math, etc). I understand they can only predict what's in their training data. I feel I have a pretty good understanding of how to get results from LLMs (or at least the ways people say you can get results).

I have had some small success with LLMs. They are reasonably good at generating sub-100 line test code when given a precise prompt, probably because that is in training data scraped from StackOverflow. I did a certification earlier this year and threw ~1000 lines of Markdown notes into Gemini and had it quiz me, which was very useful revision; it only got one question wrong of the couple of hundred I had it ask me.

I'll give a specific example of a recent failure. My job is mostly troubleshooting and reading code, all of which is public open source (so accessible via LLM search tooling). I was trying to understand something where I didn't know the answer, and this was difficult code to me so I was really not confident at all in my understanding. I wrote up my thoughts with references, the normal person I ask was busy so I asked Gemini Pro. It confidently told me "yep you got it!".

I asked someone else who saw a (now obvious) flaw in my reasoning. At some point I'd switched from a hash algorithm which generates Thing A, to a hash algorithm which generates Thing B. The error was clearly visible, one of my references had "Thing B" in the commit message title, which was in my notes with the public URL, when my whole argument was about "Thing A".

This wasn't even a technical or code error, it was a text analysis and pattern matching error, which I didn't see because I was so focused on algorithms. Even Gemini, the apparent best LLM in the world which is causing "code red" at OpenAI did not pick this up, when text analysis is supposed to be one of its core functionalities.

I also have a lot of LLM-generated summarisation forced on me at work, and it's often so bad I now don't even read it. I've seen it generate text which makes no logical sense and/or which uses so many words without really saying anything at all.

I have tried LLM-based products where someone else is supposed to have done all the prompt crafting and added RAG embeddings and I can just behave like a naive user asking questions. Even when I ask these things question which I know are in the RAG, they cannot retrieve an accurate answer ~80% of the time. I have read papers which support the idea that most RAG falls apart after about ~40k words and our document set is much larger than that.

Generally I find LLMs are at the point where to evaluate the LLM response I need to either know the answer beforehand so it was pointless to ask, or I need to do all the work myself to verify the answer which doesn't improve my productivity at all.

About the only thing I find consistently useful about LLMs is writing my question down and not actually asking it, which is a form of Rubber Duck Debugging (https://en.wikipedia.org/wiki/Rubber_duck_debugging) which I have already practiced for many years because it's so helpful.

Meanwhile trillions of dollars of VC-backed marketing assures me that these things are a huge productivity increaser and will usher in 25% unemployment because they are so good at doing every task even very smart people can do. I just don't see it.

If you have any suggestions for me I will be very willing to look into them and try them.

sulicat 12/3/2025||||
I'm probably one of the people that would say AI (at least LLMs) isn't all it's cracked up to be, and even I have examples where it has been useful to me.

I think the feeling stems from the exaggeration of the value it provides combined with a large number of internal corporate LLMs being absolute trash.

The overvaluation's effects are seen everywhere: the stock market, the price of RAM, the cost of energy, as well as IP theft issues, etc. AI has taken over and yet it still feels like just a really good fuzzy search. Like yeah, I can search something 10x faster than before but might get a bad answer every now and then.

Yeah its been useful (so have many other things). No it's not worth building trillion dollar data centers for. I would be happier if the spend went towards manufacturing or semiconductor fabs.

rr808 12/3/2025||
Lol you made me think my power bill has gone up but I didn't get a pay rise for my increased productivity.
gaigalas 12/4/2025||||
I think it's more a continuation of IDE versus pure editor.

More precisely:

On one side, it's the "tools that build up critical mass" philosophy. AI firmly resides here.

On the other, it's the "all you need is brain and plain text" philosophy. We don't see much AI in this camp.

One thing I learned is that you should never underestimate the "all you need is brain and plain text" camp. That philosophy survived many, many "fatal blows" and has come up on top several times. It has one unique feature: resilience to bloat, something that the current smart tools camp is obviously overlooking.

the__alchemist 12/4/2025||||
Similar experience. I think it's become an identity politics concept. To those who consider themselves to be anti-AI, the concept of the tool having any use is haram.

It feels awkward living in the "LLMs are a useful tool for some tasks" experience. I suspect this is because the two tribes are the loudest.

postalrat 12/3/2025||||
I see LLM's as kinda the new hotness in IDEs. And some people will use vi forever.
throwout4110 12/3/2025||||
Right, this is what I can’t quite understand. A lot of HN folks appear to have been burned by, e.g., horrible corporate or business ideas from non-technical people who don’t understand AI; that is completely understandable. What I never understand is the population of coders who don’t see any value in coding agents or are aggressively against them, or people who deride LLMs as failing to be able to do X (or hallucinating, etc.), conclude they are therefore useless and everything is AI slop, and don't recognize that what we can do today is almost unrecognizable from the world of 3 years ago. The progress has moved astoundingly fast, and the sheer amount of capital, competition, and pressure means the train is not slowing down. Predictions of “2025 is the year of coding agents” from a chorus of otherwise unpalatable CEOs were in fact absolutely true…
aisengard 12/3/2025|||
There is zero guarantee that these tools will continue to be there. Those of us who are skeptical of the value of the tools may find them somewhat useful, but we are quite wary of ripping up the workflows we've built for ourselves over a decade (or more) in favor of something that might be 10-20% more useful but could be taken away, have its fees hiked, or literally collapse in functionality at any moment, leaving us suddenly crippled. I'll keep the thing I know works and know will always be there (because it's open source, etc.), even if it means I'm slightly less productive over the next X amount of time.
throwout4110 12/3/2025||
What would you imagine a plausible scenario would possibly be that your tools would be taken away or “collapse in functionality”? I would say Claude right now has probably produced worse code and wasted more time than if I had coded things myself, but that's because this is like the first few hundred days of this. Open-weight models are also worse, but they will never go away and they improve steadily as well. I am all for people doing whatever works for them; I just don’t get the negativity or the skepticism when you look at the progress over what has been almost zero time. It’s crappy now in many respects, but it’s like saying “my car is slow” one millisecond after I floor the gas pedal.
watwut 12/3/2025|||
> What would you imagine a plausible scenario would possibly be that your tools would be taken away or “collapse in functionality”?

Simple. The company providing the tool suddenly needs actual earnings. Therefore, they need to raise the prices. They also need users to spend more tokens, so they will make the tool respond in a way that requires more refinement. After all, the latter is exactly what happened with Google search.

At this point, that is a pretty normal software cycle: try to attract a crowd by being free or cheap, then lock features behind a paywall, then simultaneously raise prices more and more while making the product worse.

This literally NEEDS to happen, because these companies do not have any other path to profitability. So, it will happen at some point.

throwout4110 12/3/2025||
Sure, but you’re forgetting that competition exists. If Anthropic's investors suddenly say “enough” and demand positive cash flow, it wouldn’t be that hard; everyone is capturing users for flywheels and spending capex on model improvements, because if they don’t they are guaranteed to lose.

It’s definitely going to get crappy. Remember Google in 2003 with relevant results and no endless SEO, or Amazon reviews being reliable, or Uber being simple and cheap, etc. Once the growth phase ends, monetization begins and the experience declines, but this is guardrailed by the fact that there are many players.

watwut 12/4/2025||
Considering what I described is how tech companies actually function and have functioned in the past, theoretical competition won't help.

They are competing themselves into massive unprofitability. Eventually they will die or do the above in cooperation. Maybe there will be a minor scandal about it, but that sort of collusion is not prosecuted or seriously investigated when done by big companies.

So, it will happen exactly as it always happens with tech.

takluyver 12/3/2025|||
My understanding is that all the big AI companies are currently offering services at a loss, doing the classic Silicon Valley playbook of burning investor cash to get big and then hoping to make a profit later. So any service you depend on could crash out of the race, and if one emerges as a victorious monopoly and you rely on them, they can charge you almost whatever they like.

To my mind, the 'only just started' argument is wearing thin. It's software, it moves fast anyway, and all the giants of the tech world have been feverishly throwing money at AI for the last couple of years. I don't buy that we're still just at the beginning of some huge exponential improvement.

ben_w 12/3/2025||
My understanding is they make a loss overall due to the spending on training new models, and that the API costs are profit making if considered in isolation. That said, this is based on guesstimates from the hosting costs of open-weight models, owing to a lack of financial transparency everywhere for the secret-weights models.
blibble 12/3/2025||
> that the API costs are profit making if considered in isolation.

no, they are currently losing money on inference too

mjr00 12/3/2025||||
> Predictions of “2025 is the year of coding agents” from a chorus of otherwise unpalatable CEOs was in fact absolutely true…

... but maybe not in the way that these CEOs had hoped.[0]

Part of the AI fatigue is that busy, competent devs are getting swarmed with massive amounts of slop from not-very-good developers. Or product managers getting 5 paragraphs of GenAI bug reports instead of a clear and concise explanation.

I have high hopes for AI and think generative tooling is extremely useful in the right hands. But it is extremely concerning that AI is allowing some of the worst, least competent people to generate an order of magnitude more "content" with little awareness of how bad it is.

[0] https://github.com/ocaml/ocaml/pull/14369

justatdotin 12/4/2025||
> busy, competent devs are getting swarmed with massive amounts of slop from not-very-good developers

that is a real issue and yet a normal problem and so has an obvious response.

oh wow that PR

bigstrat2003 12/3/2025||||
> What I never understand is the population of coders that don’t see any value in coding agents or are aggressively against them, or people that deride LLMs as failing to be able to do X (or hallucinate etc) and are therefore useless and every thing is AI Slop, without recognizing that what we can do today is almost unrecognizeable from the world of 3 years ago.

I don't recognize that because it isn't true. I try the LLMs every now and then, and they still make the same stupid hallucinations that ChatGPT did on day 1. AI hype proponents love to make claims that the tech has improved a ton, but based on my experience trying to use it those claims are completely baseless.

ben_w 12/3/2025|||
> I try the LLMs every now and then, and they still make the same stupid hallucinations that ChatGPT did on day 1.

One of the tests I sometimes do of LLMs is a geometry puzzle:

  You're on the equator facing south. You move forward 10,000 km along the surface of the Earth. You are rotate 90° clockwise. You move another 10,000 km forward along the surface of the earth. Rotate another 90° clockwise, then move another 10,000 km forward along the surface of the Earth.

  Where are you now, and what direction are you facing?
They all used to get this wrong all the time. Now the best ones sometimes don't. (That said, the only one to succeed just as I write this comment was DeepSeek; the first I saw succeed was one of ChatGPT's models, but that's now back to the usual error they all used to make.)

Anecdotes are of course a bad way to study this kind of thing.

Unfortunately, so are the benchmarks, because the models have quickly saturated most of them, including traditional IQ tests (on the plus side, this has demonstrated that IQ tests are definitely a learnable skill, as LLMs lose 40-50 IQ points when going from public to private IQ tests) and stuff like the maths olympiad.

Right now, AFAICT the only open benchmarks are the METR time horizon metric, the ARC-AGI family of tests, and the "make me an SVG of ${…}" stuff inspired by Simon Willison's pelican on a bike.

Smaug123 12/3/2025||
Out of interest, was your intended answer "where you started, facing east"?

FWIW, Claude Opus 4.5 gets this right for me, assuming that is the intended answer. On request, it also gave me a Mathematica program which (after I fixed some trivial exceptions due to errors in units) informs me that using the ITRF00 datum the actual answer is 0.0177593 degrees north and 0.168379 west of where you started (about 11.7 miles away from the starting point) and your rotation is 89.98 degrees rather than 90.

(ChatGPT 5.1 Thinking, for me, get the wrong answer because it correctly gets near the South Pole and then follows a line of latitude 200 times round the South Pole for the second leg, which strikes me as a flatly incorrect interpretation of the words "move forward along the surface of the earth". Was that the "usual error they all used to make"?)

ben_w 12/4/2025|||
> Out of interest, was your intended answer "where you started, facing east"?

Or anything close to it, so long as the logic is right, yes. I care about the reasoning failure, not the small difference between the exact quarter-circumferences of these great circles and 10,000 km. (Not that it really matters, but now that you've said the answer, this test becomes even less reliable than it already was.)

> FWIW, Claude Opus 4.5 gets this right for me, assuming that is the intended answer.

Like I said, now the best ones sometimes don't [always get it wrong].

For me yesterday, Claude (albeit Sonnet 4.5, because my testing is cheap) avoided the south pole issue, but then got the third leg wrong and ended up at the north pole. A while back ChatGPT 5 (I looked the result up) got the answer right; yesterday GPT-5-thinking-mini (auto-selected by the system) got it wrong the same way you report on the south pole, but then also got the equator wrong and ended up near the north pole.

"Never" to "unreliable success" is still an improvement.

xantronix 12/4/2025|||
Yeah, I'm pretty sure that's correct. Just whipped this up, using the WGS-84 datum.

  (use-modules (geo vincenty))

  ;; p is (lat lon bearing) in degrees; start at 0°N 0°E facing south (180°).
  ;; Each of the three legs turns the bearing 90° clockwise, then moves
  ;; 10,000 km (10,000,000 m) via the vincenty procedure, which returns
  ;; the new (lat lon bearing).
  (let walk ((p '(0 0 180))
             (i 0))
    (cond ((= i 3)
           (display p)   ; final (lat lon bearing) after three legs
           (newline))
          (else
            (walk (apply vincenty
                         (list (car p) (cadr p) (+ 90 (caddr p)) 10000000))
                  (+ i 1)))))
Running this yields:

  (0.01777744062090717 0.16837322410251268 179.98234155229127)
Surely the discrepancy is down to spheroid vs sphere, yeah?
hectdev 12/3/2025||||
This fascinates me. Just observing but because it hasn't worked for you, everyone else must be lying? (I'm assuming that's what you mean by baseless)

How does that bridge get built? I can provide tangible real life examples but I've found push back from that in other online conversations.

queenkjuul 12/4/2025|||
My boss has been passing off Claude generated code and documentation to me all year. It is consistently garbage. It consistently hallucinates. I consistently have to rewrite most, if not all, of what I'm handed.

I do also try to use Claude Code for certain tasks. More often than not, I regret it, but I've started to zero in on tasks it's helpful with (configuration and debugging, not so much coding).

But it's very easy then for me to hear people saying that AI gives them so much useful code, and for me to assume that they are like my boss: not examining that code carefully, or not holding their output to particularly high standards, or aren't responsible for the maintenance and thus don't need to care. That doesn't mean they're lying, but it doesn't mean they're right.

hectdev 12/4/2025|||
Not everyone is your boss. I have 15 years of experience coding. So when the AI hallucinates, I call that out and it improves the code it creates. If someone is passing off AI's first pass as done, they are not using the tool correctly.
queenkjuul 12/8/2025||
My boss has 28 years of experience coding so that clearly isn't the deciding factor here.

Yes, I suppose it is theoretically possible that you are that much better than my boss and I at coaxing good output from an LLM, but I'm going to continue to be skeptical until I see it with my own eyes.

int_19h 12/4/2025|||
"Claude Code" by itself is not specific enough; which model are we talking about?
bluefirebrand 12/4/2025||||
> it hasn't worked for you, everyone else must be lying?

Well, some non-zero amount of you are probably very financially invested in AI, so lying is not out of the question

Or you simply have blinders on because of your financial investments. After all, emotional investment often follows financial investment

Or, you're just not as good as you think you are. Maybe you're talking to people who are much better at building software than you are, and they find the stuff the AI builds does not impress them, while you are not as skilled so you are impressed by it.

There are lots of reasons someone might disagree without thinking everyone else is lying

hectdev 12/4/2025||
I think calling it baseless to claim benefits from AI is more than disagreeing. It's claiming a rightness that is just contrarian and hyperbolic. It's really interesting to me that the skeptics are exactly who should be using AI. Push back on it. Tell it that the code it made was wrong.
ggerni 12/3/2025|||
[flagged]
shepherdjerred 12/3/2025|||
What have you tried? How much time have you spent? Using AI is its own skill set, separate from programming.
elictronic 12/3/2025||||
AI is in a hype bubble that will crash just like every other bubble. The underlying uses are there, but just like the dot-com bubble, tulips, subprime mortgages, and even Sir Isaac Newton's failings with the South Sea Company, the financial side will fall.

This will cause bankruptcies and huge job losses. The argument for and against AI doesn't really matter in the end, because the finances don't make a lick of sense.

throwout4110 12/3/2025||
Ok, sure, the bubble/non-bubble stuff, fine, but in terms of “things I’d like to be a part of” it’s hard to imagine a more transformative technology (not to again turn off the anti-hype crowd). But ok, say it’s 1997 and you don’t like the valuations you see. As a tech person, you’re not excited by browsers, the internet, the possibilities? You don’t want to be a part of that even if it means a bubble pops? I also hear a lot of people argue that “the finances don’t make a lick of sense”, but I don’t think things are that cut and dried, and I don’t see this as obvious. I don’t think many people really know how things will evolve or what size a market correction or bubble would have.
zdragnar 12/3/2025||
What precisely about AI is transformative, compared to the internet? E-mail replaced so much of faxing, phoning and physical mail. Online shopping replaced going to stores and hoping they have what you want, and hoping it is in stock, and hoping it is a good price. It replaced travel agents to a significant degree and reoriented many industries. It was the vehicle that killed CDs and physical media in general.

With AI I can... generate slop. Sometimes that is helpful, but it isn't yet at the point where it's replacing anything for me aside from making google searches take a bit less time on things that I don't need a definitive answer for.

It's popping up in my music streams now and then, and I generally hate it. Mushy-mouthed fake vocals over fake instruments. It pops up online and aside from the occasional meme I hate it there too. It pops up all over blogs and emails and I profoundly hate it there, given that it encourages the actual author to silence themselves and replaces their thoughts with bland drivel.

Every single software product I use begs me to use their AI integration, and instead of "no" I'm given the option of "not now", despite me not needing it, and so I'm constantly being pestered about it by something.

It has, thus far, made nearly everything worse.

throwout4110 12/3/2025||
> With AI I can... generate slop. Sometimes that is helpful, but it isn't yet at the point where it's replacing anything for me aside from making google searches take a bit less time on things that I don't need a definitive answer for.

I think this is probably the disconnect; this is so wildly different from my experience. I'll grant that there are still a ton of limitations, but surely you'd concede that there has been an incredible amount of progress in a very short time? I can't imagine someone who sits down with Claude the way I do getting up and saying "this is crap and a fad and won't go anywhere."

As for generated content, I again agree with you, and you'd be surprised to learn that _execs_ agree with you too. But look at models from one, two, three years ago and tell me you don't see a frightening progression in quality. If you want to say "I'll believe it when I see it," that's fine, but my god, just look at the trajectory.

For AI slop text, once again I agree, and once again I think we all have to figure out how to use it. But it is great for, e.g., helping me rewrite a wordy message quickly, making a paper or a doc more readable, combining my notes into something polished, etc., and it's getting better and better and better.

So I disagree that it has made everything worse, but I definitely agree that it has made a lot of things worse, and we have a lot of Pets.com ideas that are totally not viable today. The point I think people are maybe missing (?) is that it's not about where we are; it's about the velocity and the future. You may be terrified and nauseated by $1T in capex on AI infra, fine, but what that tells you is that the scale is going to grow even further, _in addition_ to the methodological/algorithmic improvements to tackle things like continual learning, robustness, and higher-quality multimodal generation with, e.g., true narrative consistency. In 5 years I don't think many people will think of "slop" so negatively.

zdragnar 12/3/2025||
Where you see exponential growth in capability and value, I see the early stages of logarithmic growth.

A similar thing played out a bit with IoT and voice-controlled systems like Alexa. They've got their places, but nobody needs or wants Amazon Dash buttons, or for Alexa to do their shopping for them.

Setting an alarm or adding a note to a list is fine, remote monitoring is fine, but when it comes to things that really matter like spending money autonomously, it completely falls flat.

Long story short, I see a fad that will fall into the background of what people actually do, rather than becoming the medium that they do it by.

throwout4110 12/3/2025||
I could not disagree more, but you are far from alone, and I respect a lot of the reasons I've heard for why you and others hold this belief.
skywhopper 12/3/2025||||
Maybe those people do different work than you do? Coding agents don’t work well in every scenario.
throwout4110 12/3/2025||
Yet people imply that because it doesn't work in their scenario, it's not good?
anthem2025 12/3/2025||||
[dead]
jimbokun 12/3/2025|||
Most of the people against “AI” are not against it because they think it doesn’t work.

It’s because they know it works better every day and the people controlling it are gleefully fucking over the rest of the world because they can.

The plainly stated goal is TO ELIMINATE ALL HUMAN EMPLOYEES, with no plan for how those people will feed, clothe, or house themselves.

The reaction the author was getting was that of a horse talking to someone happily working for the glue factory.

IAmBroom 12/3/2025||
I don't think you're qualified to speak for most of the people against AI.
icedchai 12/3/2025|||
My experience is that the productivity gains are negative to neutral. Someone else basically wrote that the total "work" was simply being moved from one bucket to another. (I can't find the original link.)

Example: you might spend less time on initial development, but more time on code review and rework. That has been my personal experience.

mips_avatar 12/3/2025|||
The thing that changed my view on LLMs was solo traveling for 6 months after leaving Microsoft. There were a lot of points on the trip where I was in real trouble (severe food sickness, stolen items, missed flights) and I don't know how I would have solved those problems without ChatGPT helping.
buildsjets 12/3/2025|||
This is one of the most depressing things I have ever read on Hacker News. You claim to have become so de-actualized as a human being that you cannot fulfill the most basic items of Maslow’s Hierarchy of Needs (food, health, personal security, shelter, transportation) without the aid of an LLM.
mips_avatar 12/3/2025|||
IDK, I got really sick in a foreign country; I wasn't sure how to get to the hospital and I was alone in a hotel room. I don't really see how using ChatGPT to help me isn't actualizing.
erentz 12/3/2025|||
We used to have Google Search and Google Maps, which solved this problem of finding information about symptoms and finding medical centers near you. An LLM doesn't make anything better; it just confidently asserts things about medicine that may be wrong and always need to be verified with real sources anyway.
mips_avatar 12/4/2025||
Well, Google Search is a little nerfed since it went full ad-revenue focused.
ajkjk 12/3/2025||||
If you are operating under the constraint that talking to strangers is impossible then I could see why ChatGPT feels like a godsend...
blibble 12/3/2025||||
Did you try asking at the reception desk?
simplyluke 12/4/2025|||
Growing up in the internet age (I'm 28 now) it took me until well into my 20s to realize how many classes of problems can be solved in 30 seconds on a phone call vs hours on a computer.
mips_avatar 12/4/2025|||
The hotel owner eventually half-carried me to the hospital because I got so weak from dehydration. I'm glad I left my hotel room when I did; I was having difficulty not fainting.
cuu508 12/4/2025||
Sounds like it was the hotel owner, not ChatGPT, who saved your ass in the end.
mips_avatar 12/4/2025||
Rafael was the absolute best. He also made sure the hospital saw me right away since I was so weak. But once I was hooked up, I used ChatGPT to scan the IVs they were giving me, because I had no idea what they were pumping into me; it was all in Spanish.
pseudalopex 12/4/2025||
You said you solved problems with ChatGPT's help. You described a problem Rafael and hospital staff solved for you. And the problem you solved with ChatGPT could have been solved with a dictionary.
mips_avatar 12/4/2025||
I guess this was very American of me, but when I was so sick I wanted to know if my travel insurance would cover the hospital stay. ChatGPT confirmed that it did and told me to get to a hospital. Ultimately the hotel owner was the person who carried me to the hospital, but I wasn't lucid enough to read through my travel insurance's benefits PDF. I suppose I should have just gone to the hospital with or without insurance, but sometimes when you're very sick you don't think straight.
pseudalopex 12/10/2025||
You said you used ChatGPT to help you get to the hospital. Then you said you used it to translate pharmaceutical names. Then you said you asked it about your travel insurance. These could be true together. But it says more of you than LLMs if you could not imagine how to solve any of these problems without LLMs.

Your travel insurance did not have emergency phone service?

LLMs are not reliable for medical advice or document questions.

bluefirebrand 12/4/2025|||
I can't imagine being in this situation and thinking "I will ask ChatGPT" instead of "I will ask the people at the front desk of this hotel I'm staying at"
ohyoutravel 12/3/2025||||
I saw a post on Reddit the other day where the user was posting screenshots from ChatGPT about how they were using ChatGPT as a “Human OS” and outsourcing all decisions and information to ChatGPT. It made me queasy.
int_19h 12/4/2025|||
So basically Manna, but for your life?
simianwords 12/4/2025|||
God forbid you outsource easy things to technology; that's how humanity has been progressing since forever. But sure, throw away your calculator and do it by hand if that makes you feel any better.
prewett 12/4/2025||
Well, if your calculator has a loose wire that sometimes flips a random bit somewhere, you might find that a slide rule that is consistently correct has a certain value.
canjobear 12/3/2025||||
Extremely uncharitable reading. Plausibly they were in a foreign country where they didn't speak any of the language and didn't know how anything worked. This kind of situation was never easy for anyone.
dreamcompiler 12/4/2025||
I have been in situations where either I or someone in my party was sick and needed medical care in a foreign country where I didn't speak the language. In all cases I used my brain to figure out a solution quickly without the aid of ChatGPT, and the trip continued on.

This falls in the category of life skills or maybe just "adulting." Sure, maybe ChatGPT can be considered a life skill, but you need others compiled into your brain to fall back on when it fails. If ChatGPT is the only skill you have, what do you do if your phone gets stolen?

simianwords 12/4/2025||
What a strange thing to say. ChatGPT is not a skill, it's just a tool. It helps much faster and better than Google searching. Why the fuss about it?

Would you say the same to someone using Google?

"Sure, maybe Google can be considered a life skill, but you need others compiled into your brain to fall back on when it fails. If Google is the only skill you have, what do you do if your phone gets stolen?"

simianwords 12/4/2025||||
Your post is actually one of the most patronising things I have read. The person just used ChatGPT like Google to solve their problems and your reply is about Maslow's Hierarchy of Needs?
mountainriver 12/4/2025||||
Relax
turtlesdown11 12/3/2025|||
[flagged]
nostrademons 12/3/2025||||
This is what people used to use Google for; I remember so many times between 2000-2020 that Google saved my bacon for exactly those things (travel plans, self-diagnosis, navigating local bureaucracies, etc.)

It's a sad commentary on the state of search results and the Internet now that ChatGPT is superior, particularly since pre-knowledge-panel/AI-overview Google was superior in several ways (not hallucinating, for one, and being able to triangulate multiple sources to tell the truth).

miltonlost 12/3/2025||||
Severe food sickness? I know WebMD rightly gets a lot of hate, but this is one thing where it would be good for. Stolen items? Depending on the items and the place, possibly police. Missed flights? Customer service agent at the airport for your airline or call the airline help line.
mips_avatar 12/3/2025||
Well I got so weak I needed to go to the hospital, and that was tough.
johnnienaked 12/4/2025|||
That's pretty sad tbh
Xeronate 12/3/2025|||
Is it true that it's bad for learning new skills? My gut tells me it's useful as long as I don't use it to cheat the learning process and I mainly use it for things like follow-up questions.
deaux 12/4/2025|||
It is. It can be an enormous learning accelerator for new skills, for both adults and genuinely curious kids. The gap between low and high performers will explode. I can tell you that if I had had LLMs, I would've finished schooling at least 25% quicker while learning much more. When I say this on HN, some are quick to point out the fallibility of LLMs, ignoring that the huge majority of human teachers are many times more fallible. Now, this is a privileged place where many have been taught by what is indeed the global top 0.1% of teachers and professors, so it makes more sense that people would respond this way. Another source of these responses is simply fear.

In e.g. the US, it's a huge net negative because kids probably aren't taught these values and the required discipline. So the overwhelming majority does use it to cheat the learning process.

I can't tell you if this is the same inside e.g. China. I'm fairly sure it's not nearly as bad, though, as kids there derive much less benefit from cheating on homework/the learning process, since they're more singularly judged on standardized tests where AI is not available.

sfink 12/4/2025||
Fallibility isn't the problem. It's probably a net benefit for learning.

Promoting dependency is the problem. Replacing effort is the problem. Making self-discipline be a thing only for suckers is the problem.

deaux 12/4/2025||
I don't get this line of thinking. Never in my life have I heard the reasoning "replacing effort is the problem" when talking about children whose families can afford 24/7 brilliant private tutors. Having access to that has always been seen as an enormous privilege.
sfink 12/4/2025|||
Having an actual human who is a "brilliant private tutor" is an enormous privilege. A chatbot is not a brilliant private tutor. It is a private tutor, yes, but if it were human it would be guilty of malpractice. It hands out answers but not questions. A tutor's job is to cause the child to learn, to be able to answer similar questions. A standard chatbot's job is to give the child the answer, thus removing the need to learn. Learning can still happen, but only if the child forces it themselves.

That's not to say that a chatbot couldn't emulate a tutor. I don't know how successful it would be, but it seems like a promising idea. In actual practice, that is not how students are using them today. (And I'd bet that if you did have a tutor chatbot, that most students would learn much more about jailbreaking them to divulge answers than they would about the subject matter.)

As for the idea that replacing effort isn't a problem, I suggest you do some research, because it is everywhere. Talk to a teacher. Or a psychologist, where they call it "depth of processing" (which is a primary determinant of how much of something is incorporated, alongside frequency of exposure). Or just go to a gym and see how many people are getting stronger by paying 24/7 brilliant private weightlifters to do the lifting for them.

Xeronate 12/4/2025|||
Regarding your concerns about tutor emulation, your argument seems to be that students use chatbots as a way to cheat rather than as a tutor.

My pushback is that it's very easy to tell a chatbot to give you hints that lead to the answer, and to get deeper understanding by asking follow-up questions, if that's what you want. Cheating vs. putting in the work has always been a choice students have to make, though, and I don't think AI is going to change the number of students making each choice (or if it does, it won't be by a huge percentage). The gap in skills between the groups will grow, but there will still be a group of people who became skilled because they valued education and a group who cheated and didn't learn anything.

deaux 12/5/2025|||
> A standard chatbot's job is to give the child the answer, thus removing the need to learn.

An LLM's job is not to give the child the answer (meaning "the answer to some homework/exam question"); it's to answer the question that was asked. A huge difference. If you ask it to ask you questions, it will do so. Over the next 24 hours as of today, December 5th, 2025, hundreds of thousands of people will write a prompt that includes exactly that: "ask me questions".

> Learning can still happen, but only if the child forces it themselves.

This is literally what my original comment said, although "forcing" is a purely negative framing; rather, "learning can still happen, if the child wants to". See this:

>In e.g. the US, it's a huge net negative because kids probably aren't taught these values and the required discipline. So the overwhelming majority does use it to cheat the learning process.

I never claimed that replacing effort isn't a problem, either, just that such a downside has never been brought up in the context of access to a brilliant tutor, yet suddenly it's an impossible-to-overcome issue when it comes to LLMs.

hgomersall 12/4/2025|||
I learnt the most from bad teachers#, but only when motivated. I was forced to go away and really understand things rather than get a sufficient understanding from the teacher. I had to put much more effort in. Teachers don't replace effort, and I see no reason LLMs will change that. What they do, though, is reduce the time spent finding the relevant content, but I expect at some poorly defined cost.

# The truly good teachers were primarily motivation agents, providing enough content, but doing so in a way that meant I fully engaged.

loveparade 12/4/2025||||
I think what it comes down to, and where many people get confused, is separating the technology itself from how we use it. The technology itself is incredible for learning new skills, but at the same time it incentivizes people not to learn. Just because you have an LLM doesn't mean you can skip the hard parts of doing textbook exercises and thinking hard about what you are learning. It's a bit like passively watching YouTube videos. You'd think that having all these amazing university lectures available on YouTube makes people learn much faster, but in reality it makes people lazy, because they believe they can passively sit there, watch a video, do nothing else, and expect that to replace a classroom education. That's not how humans learn. But it's not because YouTube videos or LLMs are bad learning tools; it's because people use them as a mental shortcut where they shouldn't.
sfink 12/4/2025||
I fully agree, but to be fair, these chatbots hack our reward systems. They present a cost/benefit ratio where, for much less effort than doing it ourselves, we get a much better result than doing it ourselves (assuming this is a skill not yet learned). I think the analogy to calculators is a good one if you're careful about what you're considering: calculators did make people worse at mental math, yet mental math can be replaced with calculators for most people with no great loss. Chatbots are making people worse at mental... well, everything. Thinking in general. I do not believe that thinking can be replaced with AI for most people with no great loss.
Aerroon 12/4/2025|||
I found it useful for learning to write prose. There's nothing quite like instantaneous feedback when learning. The downside was that I hit the limit of the LLM's capabilities really quickly. They're just not that good at writing prose (overly flowery and often nonsensical).

LLMs were great for getting started though. If you've never tried writing before, then learning a few patterns goes a long way. ("He verbed, verbing a noun.")

jliptzin 12/4/2025|||
My friends and I have always wondered as we've gotten older what's going to be the new tech that the younger generation seems to know and understand innately while the older generations remain clueless and always need help navigating (like computers/internet for my parents' generation and above). I am convinced that thing is AI.

Kids growing up today are using AI for everything, whether or not that's sanctioned, and whether or not it's ultimately helpful or harmful to their intellectual growth. I think the jury is still out on that. But I do remember growing up in the 90s, spending a lot of time on the computer; older people would remark how I'd have no social skills, wouldn't be able to write cursive or do arithmetic in my head, wouldn't learn any real skills, etc. Turns out I did just fine, and now those same people always have to call me for help when they run into the smallest issue with technology.

I think a lot of people here are going to become roadkill if they refuse to learn how to use these new tools. I just built a web app in 3 weeks with only prompts to Claude Code; I didn't write a single line of code, and it works great. It's pretty basic, but it probably would have taken me 3+ months instead of 3 weeks doing it the old-fashioned way. If you tried it once a year ago and have written it off, a lot has changed since then, and the tools continue to improve every month. I really think that eventually no one will be checking code, just like hardly anyone checks the assembly output of a compiler anymore.

You have to understand how the context window works, how to establish guardrails so you're not wasting time repeating the same things over and over again, how to force it to check its own work with lots of tests, etc. It's really a game changer when you can just say in one prompt "write me an admin dashboard that displays users, sessions, and orders with a table and chart going back 30 days" or "wire up my site for google analytics, my tag code is XXXXXXX" and it just works.
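
To make the guardrails part concrete: Claude Code can pick up persistent project instructions from a CLAUDE.md file in the repo root, so you write the rules down once instead of repeating them every session. Here's a minimal sketch of what that can look like; the specific rules below are made-up examples, not my actual project's setup:

    # CLAUDE.md -- illustrative guardrails sketch, not a real project's rules
    - Run `npm test` after every change; fix failures before moving on.
    - Never edit files under migrations/ without asking first.
    - New UI components go in src/components and follow the existing naming style.
    - If a requirement is ambiguous, ask one clarifying question instead of guessing.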

queenkjuul 12/4/2025||
The thing is, Claude Code is great for unimportant casual projects and genuinely very bad at working in big, complex, established projects. The latter, of course, are the ones most people actually work on.

Well, either it's bad at it, or everyone on my team is bad at prompting. Given how dedicated my boss has been to using Claude for everything for the past year, and how the output continues to be garbage, I don't think it's a lack of effort on the team's part; I have to believe Claude just isn't good at my job.

sodafountan 12/4/2025|||
I was going to try having an AI agent analyze a well-established open-source project. I was thinking of trying something like Bitcoin Core or an open-source JavaScript library, something that has had a lot of human eyes on it. To me, that seems like a good use case, as some of those projects can get pretty complex in what they're aiming to accomplish. Bitcoin, for instance, with the sheer amount of complexity involved, would be a good candidate for having an AI agent explain the code to you as you're reviewing it. A lot of those projects are fairly well written as they are, with the higher-level concepts being the more difficult thing to grasp.

Not attempting to claim anything against your company, but I've worked for enterprises where code bases were a complete mess and even the product itself didn't have a clear goal. That's likely not the ideal candidate for AI systems to augment.

queenkjuul 12/8/2025||
Frankly, the code isn't messy whatsoever. There's just lots of it, and it's necessarily complex due to the domain. It's honestly the best codebase I've ever worked with. I shudder to think what nonsense Claude would spew trying to contextualize the spaghetti at my last job.
pennomi 12/4/2025|||
As context size increases, AI becomes exponentially dumber. Most established software is far, FAR too large for AI. But small, greenfield projects are amazing for something like Claude Code.
SubiculumCode 12/4/2025||
This is why I argue that the impact of LLMs is in the tail. It's all the small to midsize shops that want something done but don't have money to hire a programmer. It's small tasks, like pushing data around or writing a quick interface to help with day-to-day work in niche jobs and technical problems. It's the ability to quickly generate prototype logos and scripts for small-scale ad campaigns, to solve Nancy's Excel issue, etc. Big companies have big software and code stacks with tons of dependencies. Small shops have small project needs that solve significant issues facing their operations but are unlikely to ever become large enough for scaling, maintenance, or integration to be a problem at all. It's a tail, but it's long across small to midsize businesses. In research labs, where I have personal experience, AI is rapidly making more ambitious projects feasible, with quicker timelines and generally better code.
mark_l_watson 12/4/2025|||
I basically agree. OK: small, focused models for specific use cases, small models like the new mistral-3-3B that I found today to be good at tool use, and thus good for building narrowly scoped applications.

I have mostly been paid to work on AI projects since 1982, but I want to pull my hair out and scream over the big push in the USA to develop super-AGI. Such a waste of resources, and such a hit on a society that needs those resources used for better purposes.

forgotoldacc 12/4/2025|||
As a gamedev, there's nothing I hate more than AI concept art. It's always soulless. The best thing about games is there's no limit to human imagination, and you can make whatever you want. But when we leave the imagination stage to a computer then leave the final brushing up to humans, we're getting the order completely backwards. It's bonkers and just disgusting to me.

That said, game engine documentation is often pretty hard to navigate. Most of the best information is some YouTube video recorded by some savant 15-year-old with a busted microphone, and you need to skim through 30 minutes of video until you find what you need. The biggest problem is not knowing what you don't know, so it's hard to know where to begin. There are a lot of things you may think you need to spend 2 days implementing, but the engine may have a single function and a couple of built-in settings to do it.

Where LLMs shine is that I can ask a dumb question about this stuff and be pointed in the right direction pretty quickly. The implementation it spits out is often awful (if not unusable), but I can ask a question and it'll name-drop the specific function and setting names that'll save me a lot of work. From there, I know what to look up, and it's a clear path.

And gamedev is a very strong case of not needing a correct solution. You just need things to feel right for most cases. Games that are rough around the edges have character. So LLM assistance for implementation (not art) can be handy.

KPGv2 12/4/2025|||
> must be plausible, need not be accurate

This includes, IME, the initial stages of art creation (the planning stage, not the generating stage). It's kind of like having someone to bounce ideas off of at 3am. It's a convenient way of triggering your own brain to be inspired.

eru 12/4/2025|||
> [...] a large number of places where LLMs make apparently-effective tools that have negative long-term consequences (see: anything involving learning a new skill, [...]

Don't people learn from imperfect teachers all the time?

sfink 12/4/2025||
Yes, they do. In fact, imperfect teachers can sometimes induce more learning than more perfect ones. And that's what is insidious about learning from AI. It looks like something we've seen before, something where we know how to make it useful and take advantage even of the gaps and inadequacies.

AI can be effective for learning a new skill, but you have to be constantly on your guard to prevent it from hacking your brain and making you helpless and useless. AI isn't the parent holding your bicycle and giving you a push and letting go when you're ready. It's the welded-on training wheels that become larger and more structurally necessary until the bike can't roll forward at all without them. It feeds you the lie that all you need is the theory, you don't ever need to apply it because the AI will do that for you so don't worry your pretty little head over it. AI teaches you that if something requires effort, you're just not relying on the AI enough. The path to success goes only through AI, and those people who try to build their own skills without it are suckers because the AI can effortlessly create things 100x bigger and better and more complex.

Personally, I still believe that human + AI hybrids have enormous potential. It's just that using AI constantly pushes away from beneficial hybridization and towards dependency. You have to constantly fight against your innate impulses, because it hacks them to your detriment.

I'd actually like to see an AI trained to not give answers, but to search out the point where they get you 90% of the way there and then steadfastly refuse to give you the last 10%. An AI built with the goal not of producing artifacts or answers, but of producing learning and growth in the user. (Then again, I'd like to see the same thing in an educational system...)

eru 12/4/2025||
> Personally, I still believe that human + AI hybrids have enormous potential. [...]

That was true in chess for a long time, but for at least the last 20 years or so, approximately any time the human deviates from what the AI suggests, it's a mistake.

sfink 12/4/2025||
It turns out that with enough effort, "chess" can be lumped in with "arithmetic". The machines are just better at it. We're continually finding new things in that category, including things we never would have guessed. But that doesn't mean that everything is. At least right now, very little is.

Even things that AI has gotten best at, like coding, are nowhere near that category yet. AI-written text and code is still crap compared to what humans can write. Both can often superficially look better, but the closer you look and the less a human guided it, the worse you discover it is.

eru 12/5/2025||
I suspect you are comparing AI output to the best human output?

Chess bots could beat the vast majority of humans at their game long before they could beat the world champion.

Similarly, AI-generated code, text, images, etc. are getting more and more competitive with what regular humans can produce. Especially if you take speed and cost into account.

xg15 12/3/2025|||
Not sure if that's also category #2 or a new one, but also: places where AI is at risk of effectively becoming a drug and being actively harmful to the user, such as virtual friends/spouses, delusion-confirming sycophants, etc.
WhyOhWhyQ 12/3/2025|||
I also would like to see AI end up dying off except for a few niches, but I find myself using it more and more. It is not a productivity boost in the way I end up using it, interestingly. Actually I think it is actively harming my continued development, though that could just be me getting older, or perhaps massive anxiety from joblessness. Still, I can't help but ask it if everything I do is a good idea. Even in the SO era I would try to find a reference for every little choice I made to determine if it was a good or bad practice.
SpaceNoodled 12/3/2025||
That honestly sounds like addiction.
wkat4242 12/4/2025||
I also hoped it would crash and burn. The real value-added use cases will remain. The overhyped crap won't.

But the shockwave will cause a huge recession, and all those investors who put up trillions will not take their losses. Rich people never get poorer. One way or another, we consumers will end up paying for their mistakes: huge inflation, job losses, energy costs, service enshittification, whatever. We're already seeing the memory crisis having huge knock-on effects, with next year's phones being much more expensive. That's one of the ways we are going to be paying for this circus.

I really see value in it too, sure. But the amount of investment that goes into it is insane. It's not nearly that valuable. LLMs are not good for everything, and the next big thing is still a big question mark. AI is dragged by the hair into use cases where it doesn't belong. The same shit we saw with blockchains, but now at a world-crashing scale. It's very scary seeing so much insanity.

But anyway whatever I think doesn't matter. Whatever happens will happen.

JumpCrisscross 12/3/2025||
> AI-powered map

> none of it had anything to do with what I built. She talked about Copilot 365. And Microsoft AI. And every miserable AI tool she's forced to use at work. My product barely featured. Her reaction wasn't about me at all. It was about her entire environment.

She was given two context clues. AI. And maps. Maps work, which means all the information in an "AI-powered map" descriptor rests on the adjective.

Freak_NL 12/3/2025|
The product website isn't convincing either. It's only in private beta, and the first example shows 'A scenic walking tour of Venice' as the desired trip. I'll readily believe LLMs will gladly give you some sort of itinerary for walking in Venice, including all the highlights people write and post about a lot on social media to show how great their life is. But if you asked anyone knowledgeable about travel in that region, the counter-questions would be 'Why Venice specifically? I thought you hated crowds — have you considered less crowded alternatives where you will be appreciated more as a tourist? Have you actually been to Italy at all?'

LLMs are always going to give you the most plausible thing for your query, and will likely just rehash the same destinations from hundreds of listicles and status-signalling social media posts.

She probably understood this from the minimal description given.

JumpCrisscross 12/3/2025|||
> I'll readily believe LLMs will gladly give you some sort of itinerary for walking in Venice

I tried this in Crotone in September. The suggested walking tour was shit. The facts weren't remarkable. The stops were stupid and stupidly laid out. The whole experience was dumb and only redeemed because I was vacationing with a friend who founded one of the AI companies.

> if you asked anyone knowledgable about travel in that region, the counter questions would be 'Why Venice specifically?

In the region? Because it's a gorgeous city with beautiful architecture, history and festivals?

Freak_NL 12/4/2025||
> In the region? Because it's a gorgeous city with beautiful architecture, history and festivals?

That would be a great answer to continue from. Would you come for the Biennale specifically? Do you care greatly about sustainability? Would you enjoy yourself more in a different gorgeous city without the mass-tourism problem if that meant you would feel more welcome? Is there a way you can visit Venice without contributing to the issue as much? Off-season perhaps?

Venice is unique, but there are a lot of gorgeous places in the region, from Verona to Trieste.

drivebyhooting 12/3/2025|||
If it’s your first time going to Italy you absolutely should visit Venice. The crowds are unpleasant, but so what? Are you going to avoid Rome too? Only go to little provincial villages?
Freak_NL 12/4/2025||
Why should you absolutely visit Venice? It's not just that the crowds are unpleasant; you are actively contributing to a problem.

No, you don't have to avoid Rome — it's not as bad as Venice, and can support more people — but plan ahead and don't just do a tour of all the 'must see' highlights. Look into the off season if you are a history buff with a hyperfocus on Rome — you won't be able to finish your list otherwise due to all the pointless waiting around.

And yes, visit provincial villages and eat in an authentic Italian restaurant where tourists are mostly other Italians. Experience the difference. But you are not limited to villages. Italy is huge, and there are a lot of cities with remarkable museums, world-renowned festivals, great cuisine, and where your money is more than welcome and your stay won't be marred by extreme crowds and pushy con artists in faux Roman gladiator gear.

spit2wind 12/3/2025||
> Bring up AI in a Seattle coffee shop now and people react like you're advocating asbestos.

I don't know who first used the asbestos analogy, but it's 1000% on point.

I think Cory Doctorow says it best:

"AI is the asbestos we're shoveling into the walls of our society — and our descendants will be digging it out for generations."

I believe that's exactly the language to combat AI hype.

nullbound 12/3/2025||
'If you could classify your project as "AI," you were safe and prestigious. If you couldn't, you were nobody. Overnight, most engineers got rebranded as "not AI talent."'

It hits weirdly close to home. Our leadership did not technically mandate use, but it 'strongly encourages' it. I haven't even had my review yet, but I know that once we get to the goals part, use of AI tools will be an actual metric (which, speaking as someone somewhere between skeptic and evangelist, is dumb).

But the 'AI talent' part fits. For mundane stuff like data models, I need full committee approval from people who don't get it anyway (and whose entire contribution is 'what other companies are doing').

kg 12/3/2025||
I know of at least one bigco that will no longer hire anyone, period, who doesn't have at least 6 months of experience using genai to code and isn't enthusiastic about genai. No exceptions. I assume this is probably true of other companies too.

I think it makes some amount of sense if you've decided you want to be "an AI company", but it also makes me wary. Apocryphally, Google for a long period of time struggled to hire some people because they weren't an 'ideal culture fit': i.e., you're trying to hire someone to fix Linux kernel bugs you hit in production, but they don't know enough about Java or Python to pass the interview gauntlet...

empressplay 12/3/2025||
Like any tool, the longer you use it the better you learn where you can extract value from it and where you can't, where you can leverage it and where you shouldn't. Because your behaviour is linked to what you get out of the LLM, this can be quite individual in nature, and you have to learn to work with it through trial and error. But in the end engineers do appear to become more productive 'pairing' with an LLM, so it's no surprise companies are favouring LLM-savvy engineers.
bigstrat2003 12/3/2025|||
> But in the end engineers do appear to become more productive 'pairing' with an LLM

Quite the opposite: LLMs reduce productivity, they don't increase it. They merely give the illusion of productivity because you can generate code real fast, but that isn't actually useful when you then spend time fixing all the mistakes the LLM made. It is absolutely insane that companies are stupid enough to require people to use something that cripples them.

sleepybrett 12/3/2025|||
So far, for me, it's just an annoying tool that gets worse outcomes potentially faster than just doing it by hand.

It doesn't matter how much I use it. It's still just an annoying tool that makes mistakes, which you try to correct by arguing with it but eventually just fix yourself. At best it gets you 80% of the way there.

beloch 12/3/2025||
The full quote from that section is worth repeating here.

---------

"If you could classify your project as "AI," you were safe and prestigious. If you couldn't, you were nobody. Overnight, most engineers got rebranded as "not AI talent." And then came the final insult: everyone was forced to use Microsoft's AI tools whether they worked or not.

Copilot for Word. Copilot for PowerPoint. Copilot for email. Copilot for code. Worse than the tools they replaced. Worse than competitors' tools. Sometimes worse than doing the work manually.

But you weren't allowed to fix them—that was the AI org's turf. You were supposed to use them, fail to see productivity gains, and keep quiet.

Meanwhile, AI teams became a protected class. Everyone else saw comp stagnate, stock refreshers evaporate, and performance reviews tank. And if your team failed to meet expectations? Clearly you weren't "embracing AI." "

------------

On the one hand, if you were going to bet big on AI, there are aspects of this approach that make sense. e.g. Force everyone to use the company's no-good AI tools so that they become good. However, not permitting employees outside of the "AI org" to fix things neatly nixes the gains you might see while incurring the full cost.

It sounds like MS's management, the same as many other tech corp's, has become caught up in a conceptual bubble of "AI as panacea". If that bubble doesn't pop soon, MS's products could wind up in a very bad place. There are some very real threats to some of MS's core incumbencies right now (e.g. from Valve).

zmmmmm 12/3/2025||
This is a place with a high density of people who have the agency to influence the outcome, so I think it's important for people here to acknowledge that much of what the negative people think is probably 100% true.

There will absolutely be some cases where AI is used well. But probably the larger fraction will be cases where AI does not provide a better service, experience, or tool; it will be used to provide a cheaper but shittier one. This will be a big win for the company or service implementing it, but it will suck for literally everybody else involved.

I really believe there's huge value in implementing AI pervasively. However, it's going to be really hard work and probably take 5 years to do it well. We need to take an engineering and human-centred approach and do it steadily and incrementally over time. The current semi-religious fervour about implementing it rapidly and recklessly is going to be very harmful in the longer term.

themafia 12/3/2025||
Instead of admitting you built the wrong thing, you denigrate a friend and someone whom you admire. Instead of reconsidering the value of AI, you immediately double down.

This is a product of hurt feelings and not solid logic.

vessenes 12/3/2025|
Thanks for the post - it's work to write and synthesize, and I always appreciate it!

My first reaction was "replace 'AI' with the word 'Cloud'" ca 2012 at MS; what's novel here?

With that in mind, I'm not sure there is anything novel about how your friend is feeling, or about the organizational dynamics, or in fact about how large corporations go after business opportunities; on those terms, I think your friend's feelings are a little boring, or at least don't give us any new market data.

In MS in that era, there was a massive gold rush inside the org to Cloud-ify everything and move to Azure - people who did well at that prospered, people who did not, ... often did not. This sort of internal marketplace is endemic, and probably a good thing at large tech companies - from the senior leadership side, seeing how employees vote with their feet is valuable - as is, often, the directional leadership you get from a Satya who has MUCH more information than someone on the ground in any mid-level role.

While I'm sure there were many naysayers about the Cloud in 2012, they were wrong, full stop. Azure is immensely valuable. It was right to dig in on it and compete with AWS.

I personally think Satya's got a really interesting hyperscaling strategy right now -- build out national-security-friendly datacenters all over the world -- and I think that's going to pay off. But I could be wrong, and his strategy might be much more sophisticated and diverse than that; either way, I'm pretty sure Seattleites who hate how AI has disrupted their orgs and changed power politics and winners and losers in-house will have to roll with the program over the next five years and figure out where they stand and what they want to work on.

mips_avatar 12/3/2025||
It does feel like, without a compelling AI product, Microsoft isn't super differentiated. Maybe Satya is right that scale is a differentiator, but I don't think people are as trapped in an AI ecosystem as they are in Azure.
vessenes 12/4/2025|||
Their hyperscale data centers are super compelling. And they get OpenAI IP for some time. I don't think we've really seen what they want to launch on the product side yet.

Satya mentioned recently that computer-use agents use something like 5x the Windows license time on Azure compared to a single person; they see a lot of inference growth coming, and it's multiplicative in that it uses both their compute and their Azure infra.

ilaksh 12/3/2025|||
Lol. You don't think Microsoft has _a_ compelling AI product? The new version of 365 Copilot is objectively compelling, even if it is a work in progress. And GitHub Copilot is also objectively compelling.
mips_avatar 12/4/2025||
I don't think anyone would choose GitHub Copilot over Cursor.
hexator 12/3/2025||
Moving to the Cloud proved to be a pretty nice moneymaker far faster and more concretely than AI has been for these companies. It's a fair comparison regarding corporate pushes but not anything more than that.