The models each have their own innate knowledge of the programming ecosystem, frozen at the point in time when their last training data was collected. However, unlike humans, they cannot update that knowledge unless a new fine-tuning is performed, and even then, they can only learn about new libraries that are already in widespread use.
So if everyone now shifts to Vibe Coding, does that mean software ecosystems effectively become frozen? New libraries can't gain popularity because AIs won't use them in code, and AIs won't start using them because they aren't popular.
I saw a submission earlier today that perfectly illustrated why AI is eating people who write code:
> You could spend a day debating your architecture: slices, layers, shapes, vegetables, or smalltalk. You could spend several days eliminating the biggest risks by building proofs-of-concept to eliminate unknowns. You could spend a week figuring out how you’ll store, search, and cache data and which third-party integrations you’ll need.
$5k/person/week to have an informed opinion on how to store your data! AI is going to look at the billion times we've already asked these questions and make an instant decision, and the really, really important part is that it doesn't much matter what we choose anyway, because there are dozens of right answers.
And then, yes, you’ll have the legions of vibe coders living in Plato’s cave and churning out tinker toys.
There are still real things being done, but they often don’t pay as nicely or live in the spotlight.
There is an interesting aspect to this: there's maybe more incentive to open-source stuff now, just to get usage examples into the training set. But if context windows keep expanding, it may also just not matter.
The trick is to have good docs. If you don't, then step one is to work with the model to write some. It can then write its own summaries based on what it found 'surprising', and those can be loaded into the context when needed.
That's where we're at. The LLM needs to be told about the brand new API by feeding it new docs, which just uses up tokens in its context window.
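As a rough sketch of that workflow (the helper name, directory layout, and character budget below are all hypothetical, and no particular LLM SDK is assumed), feeding fresh docs or model-written summaries into the context might look something like this:

```python
# Hypothetical sketch: pull doc summaries into the prompt so the model can
# work with an API newer than its training data. Every summary included here
# consumes context-window tokens, which is the trade-off described above.
from pathlib import Path

MAX_CONTEXT_CHARS = 20_000  # crude stand-in for a real token budget

def build_prompt(task: str, docs_dir: str = "docs/summaries") -> str:
    """Concatenate doc summaries until the budget runs out, then append the task."""
    pieces, used = [], 0
    for path in sorted(Path(docs_dir).glob("*.md")):
        text = path.read_text()
        if used + len(text) > MAX_CONTEXT_CHARS:
            break
        pieces.append(f"## {path.stem}\n{text}")
        used += len(text)
    return "\n\n".join(pieces) + f"\n\n# Task\n{task}"

# Whatever model or agent you use, this assembled string is what it actually sees.
print(build_prompt("Migrate our client code to the new v2 endpoints."))
```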
New challenges would come up. When calculators made arithmetic easy, math challenges moved to the next level. If AI does all the thinking and creativity, humans will move to the next level. That level could be some menial work that AI can't touch. For example, navigating the complexities of legacy systems, workflows, and human interactions needed to keep things working.
Well this sounds delightful! Glad to be free of the thinking and creativity!
Everyone wanted to be an architect. Well, here’s our chance!
The fun part, though, is that future coding LLMs will eventually be poisoned by ingesting past LLM-generated slop code if left unrestricted. The most valuable codebases for improving LLM quality in the future will be the ones written by humans with high-quality coding skills who are not reliant, or only minimally reliant, on LLMs, which makes the humans who write them more valuable.
Think about it: a new, even better programming language is created, call it Sapphire on Skates or whatever. How does an LLM know how to output high-quality, idiomatically correct code for that hot new language? The answer is that _it doesn't_. Not until 1) somebody writes good code in that language for the LLM to absorb, and 2) does so in a large enough quantity for patterns to emerge that the LLM can reliably identify as idiomatic.
It'll be pretty much like the end of Asimov's "The Feeling of Power" (https://en.wikipedia.org/wiki/The_Feeling_of_Power) or his almost exactly LLM-relevant novella "Profession" ( https://en.wikipedia.org/wiki/Profession_(novella) ).
When insight from a long-departed dev is needed right now to explain why these rules work in this precise order, but fail when the order is changed, do you have time to git bisect to get an approximate date, then start trawling through chat logs in the hopes you'll happen to find an explanation?
Having to dig through all that other crap is unfortunate. Ideally you have tests that encapsulate the specs, which are then also code, and which help with said refactors.
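For what it's worth, here is a minimal sketch of what "tests that encapsulate the specs" can look like; the payroll rule and function are made up purely for illustration:

```python
# Made-up example: the spec ("overtime beyond 40 hours is paid at 1.5x")
# lives in executable tests instead of a slide deck or a chat log, so a
# refactor that breaks the rule fails immediately.
def weekly_pay(hours: float, rate: float) -> float:
    base = min(hours, 40) * rate
    overtime = max(hours - 40, 0) * rate * 1.5
    return base + overtime

def test_no_overtime_below_40_hours():
    assert weekly_pay(38, 20.0) == 760.0

def test_overtime_is_time_and_a_half():
    # 40 * 20 + 5 * 20 * 1.5 = 800 + 150
    assert weekly_pay(45, 20.0) == 950.0
```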
Test-driven development doesn't actually work. No paradigm does. Fundamentally, it all boils down to communication: and generative AI systems essentially strip away all the "non-verbal" communication channels, replacing them with the subtext equivalent of line noise. I have yet to work with anyone good enough at communicating that I can do without the side-channels.
Actually, really thinking about it: if I were running a company that allowed or promoted AI use, that would be the first priority. Whatever is prompted must be stored forever.
This is a human problem, not a technological one.
You can still have all your aforementioned broken PowerPoints etc. and use AI to help write code you would previously have written by hand.
If your processes are broken enough to create unmaintainable software, they will do so regardless of how code pops into existence. AI just speeds it up either way.
Personally, I'm in the "you shouldn't leave vital context implicit" camp; but in this case, the software was originally written by "if I don't already have a doctorate, I need only request one" domain experts, and you would need an entire book to provide that context. We actually had a half-finished attempt – 12 names on the title page, a little over 200 pages long – and it helped, but chapter 3 was an introduction-for-people-who-already-know-the-topic (somehow more obscure than the context-free PowerPoints, though at least it helped us decode those), chapter 4 just had "TODO" on every chapter heading, and chapter 5 got almost to the bits we needed before trailing off with "TODO: this is hard to explain because" notes. (We're pretty sure they discussed this in more detail over email, but we didn't find it. Frankly, it's lucky we have the half-finished book at all.)
AI slop lacks this context. If the software had been written using genAI, there wouldn't have been the stylistic consistency to tell us we were on the right track. There wouldn't have been the conspicuous gap in naming, elevating "the current system didn't need that helper function, so they never wrote it" to a favoured hypothesis, allowing us to identify the only possible meaning of one of the words in chapter 3, and thereby learn why one of those rules we were investigating was chosen. (The helper function would've been meaningless at the time, although it does mean something in the context of a newer abstraction.) We wouldn't have been able to use a piece of debugging code from chapter 6 (modified to take advantage of the newer debug interface) to walk through the various data structures, guessing at which parts meant what using the abductive heuristic "we know it's designed deliberately, so any bits that appear redundant probably encode a meaning we don't yet understand".
I am very glad this system was written by humans. Sure, maybe the software would've been written faster (though I doubt it), but we wouldn't have been able to understand it after-the-fact. So we'd have had to throw it away, rediscover the basic principles, and then rewrite more-or-less the same software again – probably with errors. I would bet a large portion of my savings that that monstrosity is correct – that if it doesn't crash, it will produce the correct output – and I wouldn't be willing to bet that on anything we threw together as a replacement. (Yes, I want to rewrite the thing, but that's not a reasoned decision based on the software: it's a character trait.)
A program to calculate payroll might be easy to understand, but unless you understand enough about finance and tax law, you can't successfully modify it. Same with an audio processing pipeline: you know it's doing something with Fourier transforms, because that's what the variable names say, but try to tweak those numbers and you'll probably destroy the sound quality. Or a pseudo-random number generator: modify that without understanding how it works, and even if your change feels better, you might completely break it. (See https://roadrunnerwmc.github.io/blog/2020/05/08/nsmb-rng.htm..., or https://redirect.invidious.io/watch?v=NUPpvoFdiUQ if you want a few more clips.)
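As a toy illustration of the PRNG point (not taken from the linked posts), here is how an innocent-looking tweak to a linear congruential generator's constants can silently destroy it while the code keeps running:

```python
# Toy illustration: a linear congruential generator x -> (a*x + c) mod m.
# With a % 4 == 1 and c odd, the Hull-Dobell conditions guarantee the full
# period; "tuning" a constant without understanding that collapses it.
def lcg_period(a: int, c: int, m: int, seed: int = 1) -> int:
    """Length of the cycle the generator eventually falls into from `seed`."""
    seen, x, step = {}, seed, 0
    while x not in seen:
        seen[x] = step
        x = (a * x + c) % m
        step += 1
    return step - seen[x]

M = 2**16
print(lcg_period(5, 1, M))  # 65536: visits every state before repeating
print(lcg_period(6, 1, M))  # 1: the "harmless" change degenerates to a fixed point
```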
I've worked with codebases written by people with varying skillsets, and the only occasions where I've been confused by the subtext have been when the code was plagiarised.
You’re gonna work on captcha puzzles and you’re gonna like it.
This seems completely out of whack with my experience of AI coding. I'm definitely in the "it's extremely useful" camp but there's no way I would describe its code as high quality and efficient. It can do simple tasks but it often gets things just completely wrong, or takes a noob-level approach (e.g. O(N) instead of O(1)).
Is there some trick to this that I don't know? Because personally I would love it if AI could do some of the grunt work for me. I do enjoy programming but not all programming.
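To put the O(N)-vs-O(1) complaint above in concrete terms (a made-up toy, not actual model output), the difference is often as simple as a list versus a set for membership checks:

```python
# Toy example: membership checks against a list rescan it on every call,
# while a set (hash table) answers in constant time on average.
banned_ids = [123, 456, 789]            # imagine tens of thousands of entries

def is_banned_slow(user_id: int) -> bool:
    return user_id in banned_ids        # O(N): linear scan per lookup

BANNED_IDS = set(banned_ids)            # build once...

def is_banned_fast(user_id: int) -> bool:
    return user_id in BANNED_IDS        # ...then O(1) per lookup

assert is_banned_slow(456) and is_banned_fast(456)
```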
I find they all make errors, but I spot 95% of them immediately by eye and either correct them manually or reroll through prompting.
The error rate has gone down in the last 6 months, though, and the C# code I mostly generate has become an order of magnitude more efficient. I would rarely produce code that is more efficient than what AI produces now. (I do have a prompt that tells it to use all the latest platform advances and to search the web first for the latest updates that will improve the efficiency and size of the code.)
All this went away. I felt a loss of joy and nostalgia for it. It was bitter.
Not bad, but bitter.
The bitter lesson is going to be for junior engineers who see fewer job offers and don't see consulting powerhouses eating their lunch.
That's what happened to manufacturing after all.
Therefore, I do not anticipate a massive offshoring of software like what happened in manufacturing. Yet, a lot of software work can be fully specified and will be outsourced.
At least we have one person who understands it in detail: the one who wrote it.
But with AI-generated code, it feels like nobody writes it anymore: everybody reviews. Not only do we not like reviewing, we don't do it well. And if you want to review it thoroughly, you might as well write it yourself. Many open-source maintainers will tell you that, often, it's faster for them to write the code than to review a PR from a stranger they don't trust.
My only gripe is that the models are still pretty slow, and that discourages iteration and experimentation. I can't wait for the day a Claude 3.5-grade model with 1000 tok/s is released; that will be a total game changer for me. Gemini 2.5 recently came closer, but it's still not there.
The product itself is exciting and solves a very real problem, and we have many customers who want to use it and pay for it. But damn, it hurts my soul knowing what goes on under the hood.
Another area where I find it very helpful is when I need to use, in my own code, a technique someone implemented in another language. No longer do I need to spend hours figuring out how they did it. I just ask an AI to explain it to me, and then often simply translate the code.
AI coding has removed the drudgery for me. It made coding 10X more enjoyable.
We have both been using or integrating AI code-support tools since they became available, and we have both been writing code (usually Python) for 20+ years.
We both agree that Windsurf + Claude is our default IDE/environment from now on. We also agree that for all future projects we think we can likely cut the number of engineers needed by a third.
Based on what I've been using for the last year professionally (Copilot) and on the side, I'm confident I could build faster, better, and with less effort with 5 engineers and AI tools than with 10 or 15. Communication overhead also drops by 3x, which prevents slowdowns.
So if I have an HA 5-layer stack application (fe, be, analytics, train/inference, networking/data mgmt) with IPC between the layers, then instead of one senior and two juniors per process, 15 people in total, I only need the 5 mid-seniors now.
Compared to what you see from game jams, where solo devs sometimes create whole games in just a few days, it was pretty trash.
It also tracks with my own experience. Yes, Cursor quickly helps me get the first 80% done, but then I spend so much time cleaning up after it that I've barely saved any time in total.
For personal projects where you don't care about code quality, I can see it as a great tool. If you actually have professional standards, no. (Except maybe for unit tests; I hate writing those by hand.)
Most of the current limitations CAN be solved by throwing even more compute at it. Absolutely. The question is: will it make economic sense? Maybe if fusion becomes viable some day, but now, with the end of fossil fuels and climate change? Is generative AI worth destroying our planet for?
At some point the energy consumption of generative AI might get so high and so expensive that you'd be better off just letting humans do the work.
Recently, we've seen a real shift away from diving straight into implementation and toward spending time on careful specification, discussion, and documentation, either with or without an AI assistant, before setting it loose to implement things.
For large, existing codebases, I sincerely believe that the biggest improvements lie in using MCP and proper instructions to connect the AI assistants to spec and documentation. For new projects I would put pretty much all of that directly into the repos.
I ended up watching maybe 10 minutes of these streams on two separate occasions, and both times he was writing code manually 90% of the time, or yelling at LLM output.
Then again, primeagen is pretty critical of vibe coding, so it was a super weird match-up anyway. I guess they decided to just have some fun, and maybe advertise the vibe coding "lifestyle" more than the technical merit of the product.
Oh, it isn't the usual content for primeagen. He mostly reacts to other technical videos and articles and rants about his love for neovim and ziglang. He has OK takes most of the time and is actually critical of the overuse of generative AI. But yeah, he's not a technical deep-dive YouTuber; he's more for entertainment.
Why would this be the exception?