Posted by danielfalbo 3 days ago
Sorry, I say it's folding the laundry. With an aging population, that's the most useful thing, if not the only one.
Could not agree more. I myself started 2025 being very skeptical, and finished it very convinced about the usefulness of LLMs for programming. I have also seen multiple colleagues and friends go through the same change of appreciation.
I noticed that for certain tasks, our productivity can be multiplied by 2 to 4. Hence my doubt: are there going to be too many developers / software engineers? What will happen to the rest of us?
I assume that fields other than software should also benefit from the same productivity boosts. I wonder if our society is ready to accept that people should work less. I think the more likely continuation is that companies will either hire less or fire more, instead of accepting to pay the same for fewer hours of human work.
I propose that we should raise the bar for the quality of software now.
Quality is a risk mitigation strategy; if software is disposable, just like cheap manufactured goods, most people won't pay for it, thinking they can just "build another one". What we don't realise is that, due to the sheer cost of building software, we've wanted quality because it's too expensive to fix later; AI could change that.
Hoping that we invest in quality, build more software (which has a mostly price-inelastic demand curve due to scale/high ROI), etc. is, I'm starting to think, just false hope from people in the tech industry who want to be optimistic, which is generally in our nature. Tech people most of the time understand very little about economics and about how people outside tech (your customers) generally operate. My reflection is mostly that I need to pivot out of software; it will be commoditized.
We have to accept, in the end, that coding/SWE is one of the fields most disrupted by this breed of AI. Disruption unfortunately probably means fewer jobs overall. The profession is on trend to disrupt and automate itself, I think; plan accordingly. I've seen so many articles claiming it's great we didn't learn to code now; that's what the AIs have done.
I’m actually curious about this and would love pointers to the folks working in this area. My impression from working with LLMs is there’s definitely a “there” there with regards to intelligence - I find the work showing symbolic representation in the structure of the networks compelling - but the overall behavior of the model seems to lack a certain je ne sais quoi that makes me dubious that they can “cross the divide,” as it were. I’d love to hear from more people that, well, sais quoi, or at least have theories.
It's interesting that Terence Tao just released his own blog post stating that they're best viewed as stochastic generators. True, he's not an AI researcher, but it does sound like he's using AI frequently with some success.
"viewing the current generation of such tools primarily as a stochastic generator of sometimes clever - and often useful - thoughts and outputs may be a more productive perspective when trying to use them to solve difficult problems" [0].
(And, in some cases, a desire to deny the people and perspectives from which the phrase originated.)
Today there is no top AI scientist that will tell you LLMs are just stochastic parrots.
It's like saying pianos are not creative because they don't make music. Well, yes, you have to play the keys to hear the music, and transformers are no exception. You need to put in your unique magic input to get something new and useful.
Super skeptical of this claim. Yes, if I have some toy poorly optimized Python example or maybe a sorting algorithm in ASM, but this won't work in any non-trivial case. My intuition is that the LLM will spin its wheels at a local minimum whose performance is overdetermined by millions of black-box optimizations in the interpreter or compiler, the signal from which is not fed back to the LLM.
Earlier this year Google shared that one of their projects (I think it was AlphaEvolve) found an optimisation in their stack that sped up their real-world training runs by 1%. As we're talking about Google here, we can be pretty sure it wasn't some trivial Python trick that they missed. Anyhow, at ~$100M per training run, that's a ~$1M saving right there. Each and every time they run a training run!
And in the past month Google also shared another "agentic" workflow where they had Gemini 2.5 Flash (their previous-gen "small" model) work autonomously on migrating codebases to support the aarch64 architecture. There they found that ~30% of the projects worked flawlessly end-to-end. Whatever costs they save from switching to ARM will translate into real-world $ saved (at Google scale, those can add up quickly).
“Optimize” in a vacuum is a tarpit for an LLM agent today, in my view. The Google case is interesting but 1% while significant at Google scale doesn’t move the needle much in terms of statistical significance. It would be more interesting to see the exact operation and the speed up achieved relative to the prior version. But it’s data contrary to my view for sure. The cynic also notes that Google is in the LLM hype game now, too.
Strong disagree on the reasoning here. Especially since Google is big and has thousands of developers, there could be a lot of code and a lot of low-hanging fruit.
The message I replied to said "if I have some toy poorly optimized python example". I think it's safe to say that matmul & kernel optimisation is a bit beyond a small python example.
Man, Antirez and I walk in very different circles! I still feel like LLMs fall over backwards once you give them an 'unusual' or 'rare' task that isn't likely to be presented in the training data.
When it comes to being able to do novel tasks on known knowledge, they seem to be quite good. One also needs to consider that problem-solving patterns are also a kind of (meta-)knowledge that needs to be taught, either through imitation/memorisation (Supervised Learning) or through practice (Reinforcement Learning). They can be logically derived from other techniques to an extent, just like new knowledge can be derived from known knowledge in general, and again LLMs seem to be pretty decent at this, but only to a point. Regardless, all of this is definitely true of humans too.
Generally, I use LLMs routinely on queries definitely no-one has written about. Are there similar texts out there that the LLM can put together and get the answer by analogy? Sure, to a degree, but at what point are we gonna start calling that intelligent? If that's not generalisation I'm not sure what is.
To what degree can you claim, as a human, that you are not just imitating knowledge patterns or problem-solving patterns, abstract or concrete, that you (or your ancestors) have seen before? Either via general observation or through intentional trial and error. It may be a conscious or unconscious process; many such patterns get baked into what we call intuition.
Are LLMs as good as humans at this? No, of course not, though sometimes they get close. But that's a question of degree; it's no argument for claiming that they are somehow qualitatively lesser.
I haven't.
I’ve seen them do fine on tasks that are clearly not in the training data, and it seems to me that they struggle when some particular type of task or solution or approach might be something they haven’t been exposed to, rather than the exact task.
In the context of the paragraph you quoted, that’s an important distinction.
It seems quite clear to me that they are getting at the meaning of the prompt and are able, at least somewhat, to generalise and connect aspects of their training to “plan” and output a meaningful response.
This certainly doesn't seem all that deep (at times frustratingly shallow), and I can see how at first glance it might look like everything was just regurgitated training data, but my repeated experience (especially over the last ~6-9 months) is that there's something more than that happening, which feels like what Antirez was getting at.
> For years, despite functional evidence and scientific hints accumulating, certain AI researchers continued to claim LLMs were stochastic parrots
> In 2025 finally almost everybody stopped saying so.
There is still no evidence that LLMs are anything beyond "stochastic parrots". There is no proof of any "understanding". This is seeing faces in clouds.
> I believe improvements to RL applied to LLMs will be the next big thing in AI.
With what proof or evidence? Gut feeling?
> Programmers resistance to AI assisted programming has lowered considerably.
The evidence is the opposite: most developers do not trust it. https://survey.stackoverflow.co/2025/ai#2-accuracy-of-ai-too...
> It is likely that AGI can be reached independently with many radically different architectures.
There continues to be no evidence beyond "hope" that AGI is even possible, let alone that Transformer models are the path there.
> The fundamental challenge in AI for the next 20 years is avoiding extinction.
Again, nothing more than a gut feeling. Much like all the other AI hype posts this is nothing more than "well LLMs sure are impressive, people say they're not, but I think they're wrong and we will make a machine god any day now".
The RL claims are also odd because, for starters, RLHF is not "reinforcement learning" by any classical definition of RL (which almost always involves an online component). And further, you can chat with anyone who has kept up with the RL field and quickly realize that this is also a technology that still hasn't quite delivered on the promises it's been making (despite being an incredibly interesting area of research). There's no reason to speculate that RL techniques will work with "agents" where they have failed to achieve widespread success in similar domains.
I continue to be confused why smart, very technical people can't just talk about LLMs honestly. I personally think we'd have much more progress if we could have conversations like "Wow! The performance of a Markov Chain with proper state representation is incredible, let's understand this better..." rather than "AI is reasoning intelligently!"
I get why non-technical people get caught up in AI hype discussions, but for technical people who understand LLMs it seems counterproductive. Even more surprising to me is that this hype has completely destroyed any serious discussion of the technology and how to use it. There's so much opportunity lost around practical uses of incorporating LLMs into software while people wait for agents to create mountains of slop.
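To be concrete about the "Markov Chain with proper state representation" point above, here's a toy sketch (entirely made up, nothing like a real model): the state is just the last k tokens, and the next token is sampled from whatever followed that state in the corpus. An LLM's context window plays the same structural role, only with a vastly richer learned representation.

    import random
    from collections import defaultdict

    def build_chain(tokens, k=2):
        # State = the last k tokens; values = tokens observed to follow that state.
        chain = defaultdict(list)
        for i in range(len(tokens) - k):
            chain[tuple(tokens[i:i + k])].append(tokens[i + k])
        return chain

    def generate(chain, seed, length=20):
        state, out = tuple(seed), list(seed)
        for _ in range(length):
            candidates = chain.get(state)
            if not candidates:
                break
            nxt = random.choice(candidates)  # next token depends on the state alone
            out.append(nxt)
            state = tuple(out[-len(seed):])
        return " ".join(out)

    corpus = "the cat sat on the mat and the cat ran after the dog".split()
    print(generate(build_chain(corpus, k=2), seed=["the", "cat"]))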
Because those smart people are usually low-rung employees while their bosses are often AI fanatics. Were they to express anti-AI views, they would be fired. Then this mentality slips into their thinking outside of work.
Real-world computers (the ones we use) are literally finite state machines
Any real-world deterministic thing can be encoded as an FSM if you make your state space big enough, since by definition it has only a finite number of states.
Your computer is strictly more computationally powerful than an FSM or PDA, even though you could represent particular states of your computer this way.
The fact that you can model an arbitrary CFG as a regular language with limited recursion depth does not mean there's no meaningful distinction between regular languages and CFGs.
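A toy illustration of that distinction (my own made-up example): balanced parentheses form a context-free language, and a regular expression can only imitate it by hard-coding a maximum nesting depth, growing the pattern for every extra level.

    import re

    def bounded_paren_regex(depth):
        # Regular-language approximation of balanced parentheses:
        # each additional nesting level must be spelled out in the pattern itself.
        pattern = r""
        for _ in range(depth):
            pattern = r"(?:\(" + pattern + r"\))*"
        return re.compile(r"^" + pattern + r"$")

    r3 = bounded_paren_regex(3)
    print(bool(r3.match("(()())")))    # True: nesting depth 2 is within the bound
    print(bool(r3.match("(((())))")))  # False: depth 4 exceeds the hard-coded bound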
You cannot execute arbitrary programs with your PC, your PC is limited in how much memory and storage it has access to.
>Your computer is strictly more computationally powerful
The abstract computer is, but _your_ computer is not.
>model an arbitrary CFG as a regular language with limited recursion depth does not mean there's no meaningful distinction between regular languages and CFGs
Yes, I agree with this. But going back to your argument: claiming that LLMs with a fixed context window are basically Markov chains, so they can't do anything useful, is a reductio ad absurdum in the exact same way as claiming that real-world computers are finite state machines.
A more useful argument about the upper bound of computational power would be along the lines of circuit complexity, I think. But even this does not really matter. An LLM does not need to be Turing complete even conceptually. When paired with tool use, it suffices that the LLM can merely generate programs that are then fed into an interpreter. (And the grammar of Turing-complete programming languages can be made simple enough; you can encode Brainfuck in a CFG.) So even if an LLM could only ever produce programs from a CFG grammar, the combination of LLM + Brainfuck executor would give Turing completeness (rough sketch below).
Edit: There was this recent HN article along those lines. https://news.ycombinator.com/item?id=46267862.
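Roughly what I have in mind, as a sketch (the llm_generate stub is hypothetical, just a stand-in for a model call): the model only has to emit a string from a trivially simple grammar, and the Turing-complete machinery lives entirely in the interpreter.

    def run_brainfuck(code, tape_len=30000):
        # Minimal Brainfuck interpreter: this is the Turing-complete half of the pair.
        tape, ptr, pc, out = [0] * tape_len, 0, 0, []
        jumps, stack = {}, []
        for i, c in enumerate(code):            # pre-compute matching brackets
            if c == "[":
                stack.append(i)
            elif c == "]":
                j = stack.pop()
                jumps[i], jumps[j] = j, i
        while pc < len(code):
            c = code[pc]
            if c == ">": ptr += 1
            elif c == "<": ptr -= 1
            elif c == "+": tape[ptr] = (tape[ptr] + 1) % 256
            elif c == "-": tape[ptr] = (tape[ptr] - 1) % 256
            elif c == ".": out.append(chr(tape[ptr]))
            elif c == "[" and tape[ptr] == 0: pc = jumps[pc]
            elif c == "]" and tape[ptr] != 0: pc = jumps[pc]
            pc += 1
        return "".join(out)

    def llm_generate(prompt):
        # Hypothetical stand-in for a model call that returns Brainfuck source.
        return "++++++++[>++++++++<-]>+."  # prints "A"

    print(run_brainfuck(llm_generate("print the letter A")))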
I never claimed that. They demonstrate just how powerful Markov chains can be with sophisticated state representations. Obviously LLMs are useful, I have never claimed otherwise.
Additionally, it doesn't require any logical leaps to understand decoder-only LLMs as Markov chains; they preserve the Markov property and otherwise behave exactly like them (toy sketch below). It's worth noting that encoder-decoder LLMs do not preserve the Markov property and cannot be considered Markov chains.
Edit: I saw that post and at the time was disappointed by how confused the author was about those topics and how they apply to the subject.
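A toy sketch of the Markov property point (the "model" here is a fake deterministic function, not a real API): the next-token distribution is a pure function of the current context window, and the window is the state, so history that has fallen out of the window cannot matter.

    import random

    CONTEXT = 4  # fixed context window = the Markov state

    def next_token_distribution(window):
        # Fake stand-in for a decoder-only forward pass. The key point is that the
        # output depends only on the current window, not on anything that fell out
        # of it earlier -- that's the Markov property.
        random.seed(sum(ord(c) * (i + 1) for i, c in enumerate(window)))
        return {t: random.random() for t in "abcd"}

    def generate(prompt, steps=10):
        window = tuple(prompt)[-CONTEXT:]
        out = list(prompt)
        for _ in range(steps):
            dist = next_token_distribution(window)
            tok = max(dist, key=dist.get)          # greedy decoding for simplicity
            out.append(tok)
            window = (window + (tok,))[-CONTEXT:]  # transition: (state, token) -> state
        return "".join(out)

    print(generate("abca"))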
That's a weird thing to end on. Surely it's worth more than one sentence if you're serious about it? As it stands, it feels a bit like the fearmongering Big Tech CEOs use to drive up the AI stocks.
If AI is really that powerful and I should care about it, I'd rather hear about it without the scare tactics.
Oil companies: we are causing global warming with all these carbon emissions, are you scared yet? so buy our stock
Pharma companies: our drugs are unsafe, full of side effects, and kill a lot of people, are you scared yet? so buy our stock
Software companies: our software is full of bugs, will corrupt your files and make you lose money, are you scared yet? so buy our stock
Classic marketing tactics, very effective.
Also "my product will kill you and everyone you care about" is not as great a marketing strategy as you seem to imply, and Big Tech CEOs are not talking about risks anymore. They currently say things like "we'll all be so rich that we won't need to work and we will have to find meaning without jobs"
There is plenty of material on the topic. See for example https://ai-2027.com/ or https://www.lesswrong.com/posts/uMQ3cqWDPHhjtiesc/agi-ruin-a...
Despite the humanoids' benign appearance and mission, Underhill soon realizes that, in the name of their Prime Directive, the mechanicals have essentially taken over every aspect of human life.
No humans may engage in any behavior that might endanger them, and every human action is carefully scrutinized. Suicide is prohibited. Humans who resist the Prime Directive are taken away and lobotomized, so that they may live happily under the direction of the humanoids.
~ https://en.wikipedia.org/wiki/With_Folded_Hands_...
Google for non-AI sources. Ask several models to get a wider range of opinions. Apply one's own reasoning capabilities where applicable. Remain skeptical in the absence of substantive evidence.
Basically, do what you did before LLMs existed, and treat LLM output like you would have a random anonymous blog post you found.