Posted by ColinWright 23 hours ago
I mourn having to repeatedly hear this never-quite-true promise that an amazing future of perfect code from agentic whatevers will come to fruition, and it's still just six months away. "Oh yes, we know we said it was coming six, twelve, and eighteen months ago, but this time we pinky swear it's just six months away!"
I remember when I first got access to the internet. It was revolutionary. I wanted to be online all the time, playing games, chatting with friends, and discovering new things. It shaped my desire to study computer science and learn to develop software! I could see and experience the value of the internet immediately. Its utility was never "six months away," and I didn't have to be compelled to use it—I was eager to use it of my own volition as often as possible.
LLM coding doesn't feel revolutionary or exciting like this. It's a mandate from the top. It's my know-nothing boss telling me to "find ways to use AI so we can move faster." It's my boss's know-nothing boss conducting Culture Amp surveys about AI usage, but ignoring the feedback that 95% of Copilot's PR comments are useless noise: "The name of this unit test could be improved." It's waiting for code to be slopped onto my screen, so I can go over it with a fine-toothed comb and find all the bugs—and there are always bugs.
Here's what I hope is six months away: The death of AI hype.
What's much more interesting is looking back 6, 12, or 18 months, or even further. 6 months ago was GPT-5, 12 months ago was GPT-4.5, 18 months ago was 4o, and the original GPT-3.5-powered ChatGPT launched a little over three years ago. If you've been following closely you'll have seen incredible changes between each of them. Not leaps to perfection, because that's not really a reasonable goal, but definite big leaps forward each time. A couple of years ago, one-shotting a basic tic-tac-toe game wasn't really possible. Now you can one-shot a fairly complex web app. It won't be perfect, or even good by a lot of measures compared to human-written software, but it will work.
I think the comparison to the internet is a good one. I wrote my first website in 1997, and saw the rapid iteration of websites and browsers back then. It felt amazing, and fast. AI feels the same to me. But given the fact that browsers still aren't good in a lot of ways I think it's fair to say AI will take a similarly long time. That doesn't mean the innovations along the way aren't freaking cool though.
It's pretty obvious the pace of change is slowing down, and there isn't a lot of evidence that shipping a better harness and post-training on using said harness is going to get us to the magical place all these CEOs have promised, where all SWE is automated.
What's happening now is training models for long-running tasks that use tools, taking hours at a time. The latest models like 4.6 and 5.3 are starting to make good on this. If you're not using models that are wired into tools and allowed to iterate for a while, then you're not getting to see the current frontier of abilities.
(E.g., if you're just using models for general-knowledge Q&A, then sure, there's only so much better you can get at that, and models tapered off there long ago. But the vision is to use agents to perform a substantial fraction of white-collar work, there are well-defined research programmes to get there, and there is steady progress.)
o1 was something like 16-18 months ago. o3 was kinda better, and GPT 5 was considered a flop because it was basically just o3 again.
I’ve used all the latest models in tools like Claude Code and Codex, and I guess I’m just not seeing the improvement? I’m not even working on anything particularly technically complex, but I still have to constantly babysit these things.
Where are the long-running tasks? Cursor’s browser that didn’t even compile? Claude’s C compiler that had gcc as an oracle and still performs worse than gcc without any optimizations? Yeah I’m completely unimpressed at this point given the promises these people have been making for years now. I’m not surprised that given enough constraints they can kinda sorta dump out some code that resembles something else in their training data.
I am not claiming it's perfect, or even particularly good at some tasks (pelicans on bicycles for example), but anyone claiming it isn't a mind-blowing achievement in a staggeringly short time is just kidding themselves. It is.
The job market will get flooded with the unemployed (it already is), with fewer jobs to replace the ones that were automated, and those remaining jobs will get reduced to minimum wage whenever and wherever possible. 25% of new college grads cannot find employment. Soon young people will be so poor that they'll beg to fight in a war. Give it 5-10 years.
This isn't a hard future to game out, and it's not pretty if we maintain this fast track of progress toward ML that requires minimal human involvement. Notice how the ruling class has increased the salaries for certain types of ML engineers; they know what's at stake. These businessmen make decisions based on expected value calculated from complex models; they aren't giving billion-dollar pay packages to engineers because it's trendy. We should use our own mental models to predict where this is going, and prevent it from happening however possible.
THE word "Luddite" continues to be applied with contempt to anyone with doubts about technology, especially the nuclear kind. Luddites today are no longer faced with human factory owners and vulnerable machines. As well-known President and unintentional Luddite D. D. Eisenhower prophesied when he left office, there is now a permanent power establishment of admirals, generals and corporate CEO's, up against whom us average poor bastards are completely outclassed, although Ike didn't put it quite that way. We are all supposed to keep tranquil and allow it to go on, even though, because of the data revolution, it becomes every day less possible to fool any of the people any of the time. If our world survives, the next great challenge to watch out for will come - you heard it here first - when the curves of research and development in artificial intelligence, molecular biology and robotics all converge. Oboy. It will be amazing and unpredictable, and even the biggest of brass, let us devoutly hope, are going to be caught flat-footed. It is certainly something for all good Luddites to look forward to if, God willing, we should live so long. Meantime, as Americans, we can take comfort, however minimal and cold, from Lord Byron's mischievously improvised song, in which he, like other observers of the time, saw clear identification between the first Luddites and our own revolutionary origins. It begins:[0]
https://archive.nytimes.com/www.nytimes.com/books/97/05/18/r...
Then next month, of course, the latest thing becomes the last thing, and suddenly it's again obvious that actually it didn't quite work.
It's like running on a treadmill towards a dangling carrot: simultaneously always right in front of our faces, yet never actually in hand.
The tools are good and improving. They work for certain things, some of the time, with varying amounts of manual stewarding, in the hands of people who really know what they're doing. This is real.
But it remains an absolutely epic leap from here to the idea that writing code per se is a skill nobody needs any more.
More broadly, I don't even really understand what that could possibly mean on a practical level, as code is just instructions for what the software should do. You can express instructions at a higher level, and tooling keeps making that more and more possible (AI and otherwise), but in the end, what does it mean to abstract fully away from the instructions in their detail? It seems really clear that this can never result in software that does what you want in a precise way, rather than some probabilistic approximation which must be continually corrected.
I think the real craft of software, insofar as there is one, is constructing systems of deterministic logic flows to make things happen in precisely the way we want them to. Whatever happens to tooling, or whatever exactly we end up calling code, that won't change.
> getting software that does what you want
so then we become PMs?
Nobody credible is promising you a perfect future. But a better future, yes! If you do not see it, then know this: you have your head firmly planted in the sand and are intentionally refusing to see what is coming. You may not like it. You may not want it. But it is coming, and you will either have to adapt or become irrelevant.
Does Copilot spit out useless PR comments? 100% yes! Are there tools that are better than Copilot? 100% yes! These tools are not perfect. But even with their imperfections, they are very useful. You have to learn to harness them for their strengths and build processes to address their weaknesses. And yes, all of this requires learning and experimentation. Without that, you will not get good results, and you will complain about these tools not being good.
I heard it will be here in six months. I guess I don't have much time to adapt! :)
Six months ago is when my coding became 100% done by AI. The utility has already been there for a while.
>I didn't have to be compelled to use it—I was eager to use it of my own volition as often as possible.
The difference is that you were a kid then, with an open mind, and now your worldview has hardened into a fixed idea of how the world works and how things should be done.
Yeah, it's weird. I'm fixated on not having bugs in my code. :)
I have encountered a lot of people saying it will be better in six months, and every six months it has been.
I have also seen a few predictions that say 'in a year or two they will be able to do a job completely.' I am sceptical, but I would say such claims are rare. Dario Amodei has been about the only prominent voice I have encountered that puts such abilities on a very short timeframe, and even he points to more than a year out.
The practical use of AI has certainly increased a lot in the last six months.
So I guess what I'm asking for is more specifics: what do you feel was claimed, by whom, and by how much did they fall short?
Without that supporting evidence, you could just be annoyed by the failure of claims that exist only in your imagination.
Maybe you’re just older.
> it's still just six months away
Reminds me of another "just around the corner" promise...[0] I think it is one thing for the average person to buy into the promises, but I've yet to understand why it happens here, within our community of programmers. It is one thing for non-experts to fall for obtuse speculative claims, but it is another for experts. I'm excited for autonomous vehicles, but in 2016 it was laughable to think they were around the corner, and only now, 10 years later, does such a feat actually start to look like it's a few years away.
Why do we only evaluate people and claims on their hits and not their misses? It just encourages people to say anything and everything, because eventually something will be right. It's 6 months away because eventually it will actually be 6 months away. But is it 6 months away because it is actually 6 months away, or because we want it to be? I thought the vibe coder's motto was "I just care that it works." Honestly, I think that's the problem: everyone cares about whether it works, and that's the primary concern of all sides of this conversation. So is it 6 months away because it is 6 months away, or because you've convinced yourself it is? You may have good reasons for believing that, you may have evidence, but evidence for a claim is meaningless without weighing it against the evidence that counters it.
[0] https://en.wikipedia.org/wiki/List_of_predictions_for_autono...
I’ve been programming since 1984.
OP basically described my current role with scary precision.
I mostly review the AI’s code, fix the plan before it starts, and nudge it in the right direction.
Each new model version needs less nudging — planning, architecture, security, all of it.
There’s an upside.
There’s something addictive about thinking of something and having it materialize within an hour.
I can run faster and farther than I ever could before.
I’ve rediscovered that I just like building things — imagining them and watching them come alive — even if I’m not laying every brick myself anymore.
But the pace is brutal.
My gut tells me this window, where we still get to meaningfully participate in the process, is short.
That part is sad, and I do mourn it quite a bit.
If you think this is just hype, you’re doing it wrong.
Also, don't forget the things that AI makes possible. It's a small accomplishment, but I have a World of Warcraft AddOn that I haven't touched in more than 10 years. Of course now, it is utterly broken. I pointed ChatGPT at my old code and asked it to update it to "retail" WoW, and it did it. And it actually worked. That's kind of amazing.
I wonder if there are some interesting groupings.
So if my corporate overlords will have me talk to the soulless Claude robot all day long in a Severance-style setting and fix its stupid bugs, but I get to keep my good salary, then I'll shed a small tear for my craft and get back to it. If not... well, then I'll be shedding a lot more tears, I guess.
I often have one programming project I do myself, on the side, and recently I've been using coding agents. Their average ability is no doubt impressive for what they are. But they also make mistakes that not even a recent CS graduate with no experience would ever make (e.g. I asked the agent for its guess as to why a test was failing; it suggested it might be due to a race condition with an operation that is started after the failing assertion).

As a lead, if someone on the team is capable of making such a mistake even once, then that person can't really code, regardless of their average performance (just as someone who sometimes lands a plane at the wrong airport, or even crashes without there being a catastrophic condition outside their control, can't really fly, regardless of their average performance). "This is more complicated than we thought and would take longer than we expected" is something you hear a lot, but "sorry, I got confused" is something you never hear.

A report by Anthropic last week said, "Claude will work autonomously to solve whatever problem I give it. So it’s important that the task verifier is nearly perfect, otherwise Claude will solve the wrong problem." Yeah, that's not something a team lead faces. I wish the agent could work like a team of programmers while I do my familiar job of project lead, but it doesn't.
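To make that failure concrete, here's a minimal sketch (hypothetical Python, not my actual project) of the shape of the bug: the assertion fails deterministically, before the operation the agent blamed has even started, so a race between the two is impossible.

    import threading

    class Order:
        def __init__(self):
            self.status = "created"   # deliberate bug: the test expects "pending"

        def process(self):
            self.status = "done"

    def test_process_order():
        order = Order()
        # This assertion fails deterministically, before any thread exists,
        # so blaming a race with the worker started below is a category error.
        assert order.status == "pending"   # <- the failing assertion
        worker = threading.Thread(target=order.process)   # the operation the agent blamed
        worker.start()
        worker.join()
        assert order.status == "done"

No experienced human would look at that and reach for "race condition."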
The models do some things well. I believe that programming is an interesting mix of inductive and deductive thinking (https://pron.github.io/posts/people-dont-write-programs), and the models have the inductive part down. They can certainly understand what a codebase does faster than I can. But their deductive reasoning, especially when it comes to the details, is severely lacking (e.g. I asked the agent to document my code; it very quickly grasped the design and even inferred some important invariants, but when it saw an `assert` in one subroutine, it documented it as guarding a certain invariant. The intended invariant was correct; it just wasn't the one the assertion was actually guarding). So I still (have to) work as a programmer when working with coding assistants, even if in a different way.
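A contrived illustration of that failure mode (hypothetical Python, not my actual code): both invariants are real properties of the design, but the documentation attributes the wrong one to the assertion.

    def pop_task(queue: list[tuple[int, str]]) -> str:
        # Design invariant: the queue is kept sorted by priority.
        # An agent in the situation I describe would document the assert
        # below as guarding that invariant -- but it doesn't. It only
        # guards non-emptiness.
        assert queue, "queue must be non-empty"
        _priority, task = queue[0]
        return task

Getting the design right while getting this detail wrong is exactly the inductive/deductive split I mean.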
I've read about great successes using coding agents on "serious" software, but what's common to those cases is that the people using the agents (Mitchell Hashimoto, antirez) are experts in the respective codebases. At the other end of the spectrum, people who aren't programmers can get some cool programs done, but I've yet to see anything produced this way (by a non-programmer) that I would call serious software.
I don't know what the future will bring, but at the moment, the craft isn't dead. When AI can really program, i.e. the experience is really like that of a team lead, I don't think that the death of programming would concern us, because once they get to that point, the agents will also likely be able to replace the team lead. And middle management. And the CTO, the CFO, and the CEO, and most of the users.
It gets hard to compare AI to humans. You can ask the AI to do things you would never ask a human to do, like retry 1000 times until it works, or assign 20 agents to the same problem with slightly different prompts. Or re-do the entire thing with different aesthetics.
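For example, a crude retry harness is trivial to write and perfectly reasonable to point at a machine, while asking a person to do the same would be absurd. A sketch (Python, where "run-agent" is a hypothetical stand-in for whatever agent CLI you use):

    import subprocess

    def retry_until_green(prompt: str, max_attempts: int = 1000) -> bool:
        # Re-run the agent, then the test suite, until the suite passes.
        for _ in range(max_attempts):
            subprocess.run(["run-agent", prompt], check=False)   # hypothetical CLI
            tests = subprocess.run(["pytest", "-q"], check=False)
            if tests.returncode == 0:
                return True   # suite is green; stop retrying
        return False

The 20-agents variant is just this loop fanned out in parallel with perturbed prompts.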