The models each have their own innate knowledge of the programming ecosystem, frozen at the point in time when their last training data was collected. However, unlike humans, they cannot update that knowledge unless a new fine-tuning is performed, and even then, they can only learn about new libraries that are already in widespread use.
So if everyone now shifts to Vibe Coding, does that mean software ecosystems effectively become frozen? New libraries can't gain popularity because AIs won't use them in code, and AIs won't start using them because they aren't popular.
I saw a submission earlier today that perfectly illustrated why AI is eating people who write code:
> You could spend a day debating your architecture: slices, layers, shapes, vegetables, or smalltalk. You could spend several days eliminating the biggest risks by building proofs-of-concept to eliminate unknowns. You could spend a week figuring out how you’ll store, search, and cache data and which third-party integrations you’ll need.
$5k/person/week to have an informed opinion on how to store your data! AI is going to look at the billion times we've already asked these questions and make an instant decision, and the really, really important part is that it doesn't much matter what we choose anyway, because there are dozens of right answers.
And then, yes, you’ll have the legions of vibe coders living in Plato’s cave and churning out tinker toys.
There are still real things being done, but they often don’t pay as nicely or live in the spotlight.
There is an interesting aspect to this: there's maybe more incentive to open-source stuff now, just to get usage examples into the training set. But if context windows keep expanding, it may also just not matter.
The trick is to have good docs. If you don't, then step one is to work with the model to write some. It can then write its own summaries based on what it found 'surprising', and those can be loaded into the context when needed.
That's where we're at. The LLM needs to be told about the brand new API by feeding it new docs, which just uses up tokens in its context window.
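As a rough sketch of that workflow (the helper name, directory layout, and character budget below are all hypothetical, and no particular LLM SDK is assumed), feeding fresh docs or model-written summaries into the context might look something like this:

```python
# Hypothetical sketch: pull doc summaries into the prompt so the model can
# work with an API newer than its training data. Every summary included here
# consumes context-window tokens, which is the trade-off described above.
from pathlib import Path

MAX_CONTEXT_CHARS = 20_000  # crude stand-in for a real token budget

def build_prompt(task: str, docs_dir: str = "docs/summaries") -> str:
    """Concatenate doc summaries until the budget runs out, then append the task."""
    pieces, used = [], 0
    for path in sorted(Path(docs_dir).glob("*.md")):
        text = path.read_text()
        if used + len(text) > MAX_CONTEXT_CHARS:
            break
        pieces.append(f"## {path.stem}\n{text}")
        used += len(text)
    return "\n\n".join(pieces) + f"\n\n# Task\n{task}"

# Whatever model or agent you use, this assembled string is what it actually sees.
print(build_prompt("Migrate our client code to the new v2 endpoints."))
```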
New challenges would come up. When calculators made arithmetic easy, math challenges moved to the next level. If AI does all the thinking and creativity, humans will move to the next level. That level could be some menial work that AI can't touch. For example, navigating the complexities of legacy systems, workflows, and human interactions needed to keep things working.
Well this sounds delightful! Glad to be free of the thinking and creativity!
Everyone wanted to be an architect. Well, here’s our chance!
The fun part, though, is that future coding LLMs will eventually be poisoned by ingesting past LLM-generated slop code if left unrestricted. The most valuable codebases for improving LLM quality in the future will be the ones written by humans with high-quality coding skills who are not reliant, or only minimally reliant, on LLMs, which makes the humans who write them more valuable.
Think about it: a new, even better programming language is created, call it Sapphire on Skates or whatever. How does an LLM know how to output high-quality, idiomatically correct code for that hot new language? The answer is that _it doesn't_. Not until 1) somebody writes good code in that language for the LLM to absorb, and 2) does so in a large enough quantity for patterns to emerge that the LLM can reliably identify as idiomatic.
It'll be pretty much like the end of Asimov's "The Feeling of Power" (https://en.wikipedia.org/wiki/The_Feeling_of_Power) or his almost exactly LLM-relevant novella "Profession" ( https://en.wikipedia.org/wiki/Profession_(novella) ).
When insight from a long-departed dev is needed right now to explain why these rules work in this precise order, but fail when the order is changed, do you have time to git bisect to get an approximate date, then start trawling through chat logs in the hopes you'll happen to find an explanation?
Having to dig through all that other crap is unfortunate. Ideally you have tests that encapsulate the specs, which are then also code, and which help with said refactors.
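For what it's worth, here is a minimal sketch of what "tests that encapsulate the specs" can look like; the payroll rule and function are made up purely for illustration:

```python
# Made-up example: the spec ("overtime beyond 40 hours is paid at 1.5x")
# lives in executable tests instead of a slide deck or a chat log, so a
# refactor that breaks the rule fails immediately.
def weekly_pay(hours: float, rate: float) -> float:
    base = min(hours, 40) * rate
    overtime = max(hours - 40, 0) * rate * 1.5
    return base + overtime

def test_no_overtime_below_40_hours():
    assert weekly_pay(38, 20.0) == 760.0

def test_overtime_is_time_and_a_half():
    # 40 * 20 + 5 * 20 * 1.5 = 800 + 150
    assert weekly_pay(45, 20.0) == 950.0
```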
Test-driven development doesn't actually work. No paradigm does. Fundamentally, it all boils down to communication: and generative AI systems essentially strip away all the "non-verbal" communication channels, replacing them with the subtext equivalent of line noise. I have yet to work with anyone good enough at communicating that I can do without the side-channels.
Actually, really thinking about it: if I were running a company that allowed or promoted AI use, that would be the first priority. Whatever is prompted must be stored forever.
This is a human problem, not a technological one.
You can still have all your aforementioned broken PowerPoints etc. and use AI to help write code you would previously have written by hand.
If your processes are broken enough to create unmaintainable software, they will do so regardless of how code pops into existence. AI just speeds it up either way.
Personally, I'm in the "you shouldn't leave vital context implicit" camp; but in this case, the software was originally written by "if I don't already have a doctorate, I need only request one" domain experts, and you would need an entire book to provide that context. We actually had a half-finished attempt – 12 names on the title page, a little over 200 pages long – and it helped, but chapter 3 was an introduction-for-people-who-already-know-the-topic (somehow more obscure than the context-free PowerPoints, though at least it helped us decode those), chapter 4 just had "TODO" on every chapter heading, and chapter 5 got almost to the bits we needed before trailing off with "TODO: this is hard to explain because" notes. (We're pretty sure they discussed this in more detail over email, but we didn't find it. Frankly, it's lucky we have the half-finished book at all.)
AI slop lacks this context. If the software had been written using genAI, there wouldn't have been the stylistic consistency to tell us we were on the right track. There wouldn't have been the conspicuous gap in naming, elevating "the current system didn't need that helper function, so they never wrote it" to a favoured hypothesis, allowing us to identify the only possible meaning of one of the words in chapter 3, and thereby learn why one of those rules we were investigating was chosen. (The helper function would've been meaningless at the time, although it does mean something in the context of a newer abstraction.) We wouldn't have been able to use a piece of debugging code from chapter 6 (modified to take advantage of the newer debug interface) to walk through the various data structures, guessing at which parts meant what using the abductive heuristic "we know it's designed deliberately, so any bits that appear redundant probably encode a meaning we don't yet understand".
I am very glad this system was written by humans. Sure, maybe the software would've been written faster (though I doubt it), but we wouldn't have been able to understand it after-the-fact. So we'd have had to throw it away, rediscover the basic principles, and then rewrite more-or-less the same software again – probably with errors. I would bet a large portion of my savings that that monstrosity is correct – that if it doesn't crash, it will produce the correct output – and I wouldn't be willing to bet that on anything we threw together as a replacement. (Yes, I want to rewrite the thing, but that's not a reasoned decision based on the software: it's a character trait.)
A program to calculate payroll might be easy to understand, but unless you understand enough about finance and tax law, you can't successfully modify it. Same with an audio processing pipeline: you know it's doing something with Fourier transforms, because that's what the variable names say, but try to tweak those numbers and you'll probably destroy the sound quality. Or a pseudo-random number generator: modify that without understanding how it works, and even if your change feels better, you might completely break it. (See https://roadrunnerwmc.github.io/blog/2020/05/08/nsmb-rng.htm..., or https://redirect.invidious.io/watch?v=NUPpvoFdiUQ if you want a few more clips.)
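As a toy illustration of the PRNG point (not taken from the linked posts), here is how an innocent-looking tweak to a linear congruential generator's constants can silently destroy it while the code keeps running:

```python
# Toy illustration: a linear congruential generator x -> (a*x + c) mod m.
# With a % 4 == 1 and c odd, the Hull-Dobell conditions guarantee the full
# period; "tuning" a constant without understanding that collapses it.
def lcg_period(a: int, c: int, m: int, seed: int = 1) -> int:
    """Length of the cycle the generator eventually falls into from `seed`."""
    seen, x, step = {}, seed, 0
    while x not in seen:
        seen[x] = step
        x = (a * x + c) % m
        step += 1
    return step - seen[x]

M = 2**16
print(lcg_period(5, 1, M))  # 65536: visits every state before repeating
print(lcg_period(6, 1, M))  # 1: the "harmless" change degenerates to a fixed point
```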
I've worked with codebases written by people with varying skillsets, and the only occasions where I've been confused by the subtext have been when the code was plagiarised.
You’re gonna work on captcha puzzles and you’re gonna like it.
This seems completely out of whack with my experience of AI coding. I'm definitely in the "it's extremely useful" camp but there's no way I would describe its code as high quality and efficient. It can do simple tasks but it often gets things just completely wrong, or takes a noob-level approach (e.g. O(N) instead of O(1)).
Is there some trick to this that I don't know? Because personally I would love it if AI could do some of the grunt work for me. I do enjoy programming but not all programming.
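To put the O(N)-vs-O(1) complaint above in concrete terms (a made-up toy, not actual model output), the difference is often as simple as a list versus a set for membership checks:

```python
# Toy example: membership checks against a list rescan it on every call,
# while a set (hash table) answers in constant time on average.
banned_ids = [123, 456, 789]            # imagine tens of thousands of entries

def is_banned_slow(user_id: int) -> bool:
    return user_id in banned_ids        # O(N): linear scan per lookup

BANNED_IDS = set(banned_ids)            # build once...

def is_banned_fast(user_id: int) -> bool:
    return user_id in BANNED_IDS        # ...then O(1) per lookup

assert is_banned_slow(456) and is_banned_fast(456)
```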
I find they all make errors, but I spot 95% of them immediately by eye and either correct them manually or reroll through prompting.
The error rate has gone down in the last 6 months, though, and the C# code I mostly generate has become an order of magnitude more efficient. I would rarely produce code that is more efficient than what AI produces now. (I do have a prompt that tells it to use all the latest platform advances and to search the web first for the latest updates that will improve the efficiency and size of the code.)
All this went away. I felt a loss of joy and nostalgia for it. It was bitter.
Not bad, but bitter.
The bitter lesson is going to be for junior engineers who see fewer job offers and don't see consulting powerhouses eating their lunch.
That's what happened to manufacturing after all.
Therefore, I do not anticipate a massive offshoring of software like what happened in manufacturing. Yet, a lot of software work can be fully specified and will be outsourced.
At least we have one person who understands it in detail: the one who wrote it.
But with AI-generated code, it feels like nobody writes it anymore: everybody reviews. Not only do we not like reviewing, we don't do it well. And if you want to review it thoroughly, you might as well write it yourself. Many open-source maintainers will tell you that, often, it's faster for them to write the code than to review a PR from a stranger they don't trust.
My only gripe is that the models are still pretty slow, and that discourages iteration and experimentation. I can't wait for the day a Claude 3.5-grade model with 1000 tok/s is released; that will be a total game changer for me. Gemini 2.5 recently came closer, but it's still not there.
The product itself is exciting and solves a very real problem, and we have many customers who want to use it and pay for it. But damn, it hurts my soul knowing what goes on under the hood.
Another area where I find it very helpful is when I need to use, in my own code, a technique someone implemented in another language. No longer do I need to spend hours figuring out how they did it. I just ask an AI to explain it to me, and then often simply translate the code.
AI coding has removed the drudgery for me. It made coding 10X more enjoyable.
We have both been using or integrating AI code-support tools since they became available, and we have both been writing code (usually Python) for 20+ years.
We both agree that Windsurf + Claude is our default IDE/environment from now on. We also agree that for all future projects we think we can likely cut the number of engineers needed by a third.
Based on what I've been using for the last year professionally (Copilot) and on the side, I'm confident I could build faster, better, and with less effort with 5 engineers and AI tools than with 10 or 15. Communication overhead also drops by 3x, which prevents slowdowns.
So if I have an HA 5-layer stack application (fe, be, analytics, train/inference, networking/data mgmt) with IPC between the layers, then instead of one senior and two juniors per process, 15 people in total, I only need the 5 mid-seniors now.
Compared to what you see from game jams, where solo devs sometimes create whole games in just a few days, it was pretty trash.
It also tracks with my own experience. Yes, Cursor quickly helps me get the first 80% done, but then I spend so much time cleaning up after it that I've barely saved any time in total.
For personal projects where you don't care about code quality, I can see it as a great tool. If you actually have professional standards, no. (Except maybe for unit tests; I hate writing those by hand.)
Most of the current limitations CAN be solved by throwing even more compute at it. Absolutely. The question is: will it make economic sense? Maybe if fusion becomes viable some day, but now, with the end of fossil fuels and climate change? Is generative AI worth destroying our planet for?
At some point the energy consumption of generative AI might get so high and so expensive that you'd be better off just letting humans do the work.
Recently, we've seen a real shift away from diving straight into implementation and toward spending time on careful specification, discussion, and documentation, either with or without an AI assistant, before setting it loose to implement things.
For large, existing codebases, I sincerely believe that the biggest improvements lie in using MCP and proper instructions to connect the AI assistants to spec and documentation. For new projects I would put pretty much all of that directly into the repos.
I ended up watching maybe 10 minutes of these streams on two separate occasions, and both times he was writing code manually 90% of the time, or yelling at LLM output.
Then again, primeagen is pretty critical of vibe coding, so it was a super weird match-up anyway. I guess they decided to just have some fun, and maybe advertise the vibe coding "lifestyle" more than the technical merit of the product.
Oh, it isn't the usual content for primeagen. He mostly reacts to other technical videos and articles and rants about his love for neovim and ziglang. He has OK takes most of the time and is actually critical of the overuse of generative AI. But yeah, he's not a technical deep-dive YouTuber; he's more for entertainment.
Why would this be the exception?