
Posted by danielfalbo 3 days ago

Reflections on AI at the End of 2025 (antirez.com)
238 points | 358 comments | page 2
jimmydoe 3 days ago|
> * The fundamental challenge in AI for the next 20 years is avoiding extinction.

Sorry, I say it's folding the laundry. With an aging population, that's the most, if not the only, useful thing.

abricq 3 days ago||
> * Programmers resistance to AI assisted programming has lowered considerably. Even if LLMs make mistakes, the ability of LLMs to deliver useful code and hints improved to the point most skeptics started to use LLMs anyway: now the return on the investment is acceptable for many more folks.

Could not agree more. I myself started 2025 being very skeptical, and finished it very convinced about the usefulness of LLMs for programming. I have also seen multiple colleagues and friends go through the same change of appreciation.

I noticed that for certain tasks, our productivity can be multiplied by 2 to 4. Hence my doubts: are there going to be too many developers / software engineers? What will happen to the rest of us?

I assume that other fields (beyond software) should also benefit from the same productivity boosts. I wonder if our society is ready to accept that people should work less. I think the more likely continuation is that companies will either hire less or fire more, instead of accepting to pay the same for fewer hours of human work.

danielfalbo 3 days ago||
> Are there going to be too many developers / software engineers? What will happen to the rest of us?

I propose that we should raise the bar for the quality of software now.

throw1235435 2 days ago|||
I don't think that will happen, because it hasn't for other technological improvements. In the end people pay for "good enough" and that's that. If "good enough" is now cheaper to implement, that's all they will do. I've seen it with other technologies: for example, more precise manufacturing has let many manufacturers cheapen things like cars and electronics just to the point where they mostly survive the warranty period; in the old days they had to "overbuild" to reach that point, putting more quality into the product.

Quality is a risk mitigation strategy; if software becomes disposable, just like cheap manufactured goods, most people won't pay for quality, figuring they can just "build another one". What we don't realise is that, because of the sheer cost of building software, we've wanted quality precisely because it's too expensive to fix later; AI could change that.

Hoping we invest in quality, more software (which has a mostly price-inelastic demand curve due to scale/high ROI), etc., is, I'm starting to think, just false hope from people in the tech industry who want to be optimistic, which is generally in our nature. Tech people understand very little about economics most of the time, or about how people outside tech (your customers) generally operate. My reflection is mostly that I need to pivot out of software; it will be commoditized.

abricq 3 days ago|||
Yes, I certainly agree. A few days ago there was a blog post here claiming that formal verification will become much more widely used thanks to AI, the author arguing that AI will help us over the difficulty barrier of writing formal proofs.
throw1235435 2 days ago|||
I'm not sure it will scale to fields other than coding and math. The RLVR approach (reinforcement learning with verifiable rewards) makes it more amenable to STEM fields in general, and most jobs, believe it or not, aren't that. The volume of open source software with good test suites effectively gave them all the training material they needed; most professions won't provide that, knowing they would be giving their moat away. From my understanding, LLMs applied to other fields still exhibit the same hallucination rates, only mildly improved, especially where there isn't public internet material in that field.
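To make the RLVR point concrete: for a coding task the reward can literally be "does the existing test suite pass". A rough sketch of that idea (the file layout and pytest invocation here are just my own illustration, not any lab's actual pipeline; requires pytest installed):

  import subprocess, sys, tempfile

  def reward_from_tests(candidate_code: str, test_code: str) -> float:
      # RLVR-style reward: the model's code earns 1.0 only if the
      # test suite passes, 0.0 otherwise. Nothing fuzzy to judge.
      with tempfile.TemporaryDirectory() as d:
          with open(f"{d}/solution.py", "w") as f:
              f.write(candidate_code)
          with open(f"{d}/test_solution.py", "w") as f:
              f.write(test_code)
          result = subprocess.run([sys.executable, "-m", "pytest", "-q"],
                                  cwd=d, capture_output=True)
          return 1.0 if result.returncode == 0 else 0.0

  candidate = "def add(a, b):\n    return a + b\n"
  tests = "from solution import add\n\ndef test_add():\n    assert add(2, 3) == 5\n"
  print(reward_from_tests(candidate, tests))  # 1.0

Most professions simply have no equivalent of that binary, machine-checkable signal.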

We have to accept in the end that coding/SWE is one of the fields most disrupted by this breed of AI. Disruption unfortunately probably means fewer jobs overall. The profession is on trend to disrupt and automate itself, I think; plan accordingly. I've seen so many articles claiming it's great that we no longer need to learn to code; that's what the AIs have done.

antihipocrat 3 days ago||
I like to think of it as adding new lanes to a highway. More will be delivered until it all jams up again.
roughly 3 days ago||
> A few well known AI scientists believe that what happened with Transformers can happen again, and better, following different paths, and started to create teams, companies to investigate alternatives to Transformers and models with explicit symbolic representations or world models.

I’m actually curious about this and would love pointers to the folks working in this area. My impression from working with LLMs is there’s definitely a “there” there with regards to intelligence - I find the work showing symbolic representation in the structure of the networks compelling - but the overall behavior of the model seems to lack a certain je ne sais quoi that makes me dubious that they can “cross the divide,” as it were. I’d love to hear from more people that, well, sais quoi, or at least have theories.

pton_xd 3 days ago||
> For years, despite functional evidence and scientific hints accumulating, certain AI researchers continued to claim LLMs were stochastic parrots: probabilistic machines that would: 1. NOT have any representation about the meaning of the prompt. 2. NOT have any representation about what they were going to say. In 2025 finally almost everybody stopped saying so.

It's interesting that Terence Tao just released his own blog post stating that they're best viewed as stochastic generators. True, he's not an AI researcher, but it does sound like he's using AI frequently with some success.

"viewing the current generation of such tools primarily as a stochastic generator of sometimes clever - and often useful - thoughts and outputs may be a more productive perspective when trying to use them to solve difficult problems" [0].

[0] https://mathstodon.xyz/@tao/115722360006034040

jdub 2 days ago||
I get the impression that folks who have a strong negative reaction to the phrase "stochastic parrot" tend to do so because they interpret it literally or analogously (revealed in their arguments against it), when it is most useful as a metaphor.

(And, in some cases, a desire to deny the people and perspectives from which the phrase originated.)

antirez 3 days ago||
What happened recently is that all the serious AI researchers who were on the stochastic-parrot side changed their point of view, but, incredibly, people without a deep understanding of such matters, previously exposed to those arguments, are lagging behind and still repeat arguments that the people who popularized them would not repeat again.

Today there is no top AI scientist that will tell you LLMs are just stochastic parrots.

emp17344 3 days ago|||
You seem to think the debate is settled, but that’s far from true. It’s oddly controlling to attempt to discredit any opposition to this viewpoint. There’s plenty of research supporting the stochastic view of these models, such as Apple’s “Illusion” papers. Tao is also a highly respected researcher, and has worked with these models at a very high level - his viewpoint has merit as well.
visarga 3 days ago||||
The stochastic parrot framing makes some assumptions, one of them being that LLMs generate from minimal input prompts, like "tell me about Transformers" or "draw a cute dog". But when the input provides substantial entropy or novelty, the output will not look like any training data. And longer sessions with multiple rounds of messages also drift out of distribution. The model is doing work outside its training distribution.

It's like saying pianos are not creative because they don't make music. Well, yes, you have to play the keys to hear the music, and transformers are no exception. You need to put in your unique magic input to get something new and useful.

geraneum 3 days ago|||
Now that you’re here, what do you mean by “scientific hints” in your first paragraph?
piker 3 days ago||
> There are certain tasks, like improving a given program for speed, for instance, where in theory the model can continue to make progress with a very clear reward signal for a very long time.

Super skeptical of this claim. Yes, it works if I have some toy, poorly optimized Python example, or maybe a sorting algorithm in ASM, but it won't work in any non-trivial case. My intuition is that the LLM will spin its wheels at a local minimum whose performance is overdetermined by millions of black-box optimizations in the interpreter or compiler, signals from which are never fed back to the LLM.

NitpickLawyer 3 days ago||
> but this won’t work in any non-trivial case

Earlier this year Google shared that one of their projects (I think it was AlphaEvolve) found an optimisation in their stack that sped up their real-world training runs by 1%. As we're talking about Google here, we can be pretty sure it wasn't some trivial Python trick that they missed. Anyhow, at ~$100M per training run, that's a $1M saving right there, each and every time they do a training run.

And in the past month Google also shared another "agentic" workflow, where they had Gemini 2.5 Flash (their previous-gen "small" model) work autonomously on migrating codebases to support the aarch64 architecture. They found that ~30% of the projects worked flawlessly end-to-end. Whatever costs they save from switching to ARM will translate into real-world dollars saved (at Google scale, those add up quickly).

piker 3 days ago|||
The second example has nothing to do with the first. I am optimistic that LLMs are great for translations with good testing frameworks.

“Optimize” in a vacuum is a tarpit for an LLM agent today, in my view. The Google case is interesting but 1% while significant at Google scale doesn’t move the needle much in terms of statistical significance. It would be more interesting to see the exact operation and the speed up achieved relative to the prior version. But it’s data contrary to my view for sure. The cynic also notes that Google is in the LLM hype game now, too.

NitpickLawyer 3 days ago||
Why do you think it's not relevant to the "optimise in a loop" thing? The way I think of it, it's using LLMs "in a loop" to move something from arch A (that costs x$) to arch B (that costs y$), where y is cheaper than x. It's still an autonomous optimisation done by LLMs, no?
piker 3 days ago||
Did the LLM suggest moving to the new architecture? If not that’s not what’s under discussion. That’s just following an order to translate.
NitpickLawyer 3 days ago||
Ah, I see your point.
Jaxan 3 days ago|||
> As we're talking about google here, we can be pretty sure it wasn't some trivial python trick that they missed.

Strong disagree with the reasoning here. Especially since Google is big and has thousands of developers, there could be a lot of code and a lot of low-hanging fruit.

NitpickLawyer 3 days ago||
> By finding smarter ways to divide a large matrix multiplication operation into more manageable subproblems, it sped up this vital kernel in Gemini’s architecture by 23%, leading to a 1% reduction in Gemini's training time.

The message I replied to said "if I have some toy poorly optimized python example". I think it's safe to say that matmul & kernel optimisation is a bit beyond a small python example.
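For a sense of what "dividing a matmul into more manageable subproblems" means in spirit, here's a rough sketch of blocked matrix multiplication (illustration only; the block size and everything else here are made up, not Google's actual kernel):

  import numpy as np

  def blocked_matmul(A, B, block=64):
      # Split one big multiplication into block x block subproblems;
      # the better the blocks fit in cache, the faster the kernel runs.
      n, k = A.shape
      _, m = B.shape
      C = np.zeros((n, m))
      for i in range(0, n, block):
          for j in range(0, m, block):
              for p in range(0, k, block):
                  C[i:i+block, j:j+block] += (
                      A[i:i+block, p:p+block] @ B[p:p+block, j:j+block]
                  )
      return C

  A, B = np.random.rand(256, 256), np.random.rand(256, 256)
  assert np.allclose(blocked_matmul(A, B), A @ B)

The hard part, of course, is finding a decomposition that is actually faster on the real hardware, which is where the search loop comes in.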

andy99 3 days ago|||
There was a discussion the other day where someone asked Claude to improve a code base 200x https://news.ycombinator.com/item?id=46197930
exitb 3 days ago||
That's most definitely not the same thing, as "improving a codebase" is an open-ended task with no reliable metrics the agent could work against.
dist-epoch 3 days ago||
https://github.com/algorithmicsuperintelligence/openevolve
piker 3 days ago||
https://chatgpt.com/backend-api/estuary/public_content/enc/e...
a_bonobo 3 days ago||
>* For years, despite functional evidence and scientific hints accumulating, certain AI researchers continued to claim LLMs were stochastic parrots: probabilistic machines that would: 1. NOT have any representation about the meaning of the prompt. 2. NOT have any representation about what they were going to say. In 2025 finally almost everybody stopped saying so.

Man, Antirez and I walk in very different circles! I still feel like LLMs fall over backwards once you give them an 'unusual' or 'rare' task that isn't likely to be presented in the training data.

oersted 3 days ago||
LLMs certainly struggle with tasks that require knowledge that is not provided to them (at significant enough volume/variance to retain it). But this is to be expected of any intelligent agent; it is certainly true of humans. It is not a good argument to support the claim that they are Chinese Rooms (unthinking imitators). Indeed, the whole point of the Chinese Room thought experiment was to consider whether that distinction even mattered.

When it comes to being able to do novel tasks on known knowledge, they seem to be quite good. One also needs to consider that problem-solving patterns are themselves a kind of (meta-)knowledge that needs to be taught, either through imitation/memorisation (Supervised Learning) or through practice (Reinforcement Learning). They can be logically derived from other techniques to an extent, just like new knowledge can be derived from known knowledge in general, and again LLMs seem to be pretty decent at this, but only to a point. Regardless, all of this is definitely true of humans too.

feverzsj 3 days ago||
In most cases, LLMs have the knowledge (data). They just can't generalize it like humans do. They can only reflect explicit things that are already there.
oersted 3 days ago||
I don't think that's true. Consider that the "reasoning" behaviour trained with Reinforcement Learning in the last generation of "thinking" LLMs is trained on quite narrow datasets of olympiad math / programming problems and various science exams, since exact unambiguous answers are needed to have a good reward signal, and you want to exercise it on problems that require non-trivial logical derivation or calculation. Then this reasoning behaviour gets generalised very effectively to a myriad of contexts the user asks about that have nothing to do with that training data. That's just one recent example.
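(To make "good reward signal" concrete: in these verifiable-reward setups the reward can be as blunt as an exact-match check on the final answer. A hypothetical sketch, not any lab's actual pipeline:)

  def verifiable_reward(model_answer: str, reference: str) -> float:
      # The entire reward signal: 1.0 for the exact final answer, else 0.0.
      # This is why olympiad math and programming problems, which have
      # unambiguous answers, are the natural training ground.
      return 1.0 if model_answer.strip() == reference.strip() else 0.0

  print(verifiable_reward("42", "42"))  # 1.0
  print(verifiable_reward("41", "42"))  # 0.0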

Generally, I use LLMs routinely on queries definitely no-one has written about. Are there similar texts out there that the LLM can put together and get the answer by analogy? Sure, to a degree, but at what point are we gonna start calling that intelligent? If that's not generalisation I'm not sure what is.

To what degree can you claim, as a human, that you are not just imitating knowledge patterns or problem-solving patterns, abstract or concrete, that you (or your ancestors) have seen before? Either via general observation or through intentional trial and error. It may be a conscious or unconscious process; many such patterns get baked into what we call intuition.

Are LLMs as good as humans at this? No, of course not, though sometimes they get close. But that's a question of degree; it's no argument for claiming that they are somehow qualitatively lesser.

jmfldn 3 days ago|||
"In 2025 finally almost everybody stopped saying so."

I haven't.

dist-epoch 3 days ago||
Some people are slower to understand things.
yeasku 20 hours ago|||
That is why they need artificial intelligence.
jmfldn 3 days ago|||
Well exactly ;)
barnabee 3 days ago|||
I don’t think this is quite true.

I've seen them do fine on tasks that are clearly not in the training data. It seems to me that they struggle when some particular type of task or solution or approach is something they haven't been exposed to, rather than when the exact task is new.

In the context of the paragraph you quoted, that’s an important distinction.

It seems quite clear to me that they are getting at the meaning of the prompt and are able, at least somewhat, to generalise and connect aspects of their training to “plan” and output a meaningful response.

This certainly doesn't seem all that deep (at times frustratingly shallow), and I can see how at first glance it might look like everything was just regurgitated training data, but my repeated experience (especially over the last ~6-9 months) is that there's something more than that happening, which feels like what Antirez was getting at.

Kiro 3 days ago||
Give me an example of one of those rare or unusual tasks.
a_bonobo 2 days ago|||
I work on a few HPC systems with unusual, kinda custom-rolled architectures. A whole bunch of Python and R packages fail to compile on these systems. There's no publicly accessible documentation for these HPC systems, nor for these custom architectures. ChatGPT and Claude so far have given me only wrong advice on how to get around these compilation errors and there's not much on Google for these errors, but HPC staff usually knew what to do.
recursive 3 days ago|||
Set the font size of a simple field in OpenXML. Doesn't even seem that rare. It said to add a run inside and set the font there; that didn't do anything. I ended up reverse-engineering the output from MS Word. This happened yesterday.
lowsong 3 days ago||
I'm impressed that such a short post can be so categorically incorrect.

> For years, despite functional evidence and scientific hints accumulating, certain AI researchers continued to claim LLMs were stochastic parrots

> In 2025 finally almost everybody stopped saying so.

There is still no evidence that LLMs are anything beyond "stochastic parrots". There is no proof of any "understanding". This is seeing faces in clouds.

> I believe improvements to RL applied to LLMs will be the next big thing in AI.

With what proof or evidence? Gut feeling?

> Programmers resistance to AI assisted programming has lowered considerably.

The evidence points the opposite way: most developers do not trust it. https://survey.stackoverflow.co/2025/ai#2-accuracy-of-ai-too...

> It is likely that AGI can be reached independently with many radically different architectures.

There continues to be no evidence beyond "hope" that AGI is even possible, let alone that Transformer models are the path there.

> The fundamental challenge in AI for the next 20 years is avoiding extinction.

Again, nothing more than a gut feeling. Much like all the other AI hype posts this is nothing more than "well LLMs sure are impressive, people say they're not, but I think they're wrong and we will make a machine god any day now".

crystal_revenge 3 days ago|
Strongly agree with this comment. Decoder-only LLMs (the ones we use) are literally Markov chains; the only (and major) difference is a radically more sophisticated state representation. Maybe "stochastic parrot" is overly dismissive sounding, but it's not a fundamentally wrong understanding of LLMs.
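To make "Markov chain with a sophisticated state representation" concrete, here's a toy sketch where the state is just the last k tokens and the transition function is a literal lookup table; a decoder-only LLM keeps the same Markov property but replaces the lookup with a learned network over the whole context window (a sketch only, obviously not an LLM):

  import random
  from collections import defaultdict

  def train(tokens, k=2):
      # The "state" is the last k tokens; transitions are observed next tokens.
      table = defaultdict(list)
      for i in range(len(tokens) - k):
          table[tuple(tokens[i:i + k])].append(tokens[i + k])
      return table

  def generate(table, state, n=10):
      # Next token depends only on the current state: the Markov property.
      out = list(state)
      for _ in range(n):
          candidates = table.get(tuple(out[-len(state):]))
          if not candidates:
              break
          out.append(random.choice(candidates))
      return " ".join(out)

  corpus = "the cat sat on the mat and the cat ran".split()
  print(generate(train(corpus), ("the", "cat")))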

The RL claims are also odd because, for starters, RLHF is not "reinforcement learning" based on any classical definition of RL (which almost always involves an online component). And further, you can chat with anyone who has kept up with the RL field and quickly realize that this is also a technology that still hasn't quite delivered on the promises it's been making (despite being an incredibly interesting area of research). There's no reason to speculate that RL techniques will work with "agents" where they have failed to achieve widespread success in similar domains.

I continue to be confused why smart, very technical people can't just talk about LLMs honestly. I personally think we'd have much more progress if we could have conversations like "Wow! The performance of a Markov Chain with proper state representation is incredible, let's understand this better..." rather than "AI is reasoning intelligently!"

I get why non-technical people get caught up in AI hype discussions, but for technical people who understand LLMs it seems counterproductive. Even more surprising to me is that this hype has completely destroyed any serious discussion of the technology and how to use it. There's so much opportunity lost around practical ways of incorporating LLMs into software while people wait for agents to create mountains of slop.

akomtu 2 days ago|||
> why smart, very technical people can't just talk about LLMs honestly

Because those smart people are usually low-rung employees while their bosses are often AI fanatics. Were they to express anti-AI views, they would be fired. Then this mentality slips into their thinking outside of work.

krackers 3 days ago|||
>Decoder-only LLMs (the ones we use) are literally Markov Chains

Real-world computers (the ones we use) are literally finite state machines

crystal_revenge 3 days ago||
Only if the computer you use does not have memory. Definitionally, if you are writing to and reading from memory, you are not using an FSM.
krackers 3 days ago||
No, it can still be modeled as a finite state machine. Each state just encodes the configuration of your memory. I.e. if you have 8 bits of memory, your state space just contains 2^8 states, one for each memory configuration.

Any real-world deterministic thing can be encoded as an FSM if you make your state space big enough, since by definition it has only a finite number of states.
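(Toy version of the argument: a machine with just 2 bits of memory becomes a 4-state FSM once you enumerate the memory configurations; illustration only:)

  from itertools import product

  # Two bits of "memory" -> enumerate all 2^2 configurations as FSM states.
  states = list(product([0, 1], repeat=2))   # (0,0), (0,1), (1,0), (1,1)

  def step(state):
      # The "program": increment the 2-bit counter, wrapping around.
      value = (state[0] * 2 + state[1] + 1) % 4
      return (value // 2, value % 2)

  # Explicit transition table of the equivalent finite state machine:
  for s in states:
      print(s, "->", step(s))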

crystal_revenge 3 days ago||
You could model a specific instance of using your computer this way, but you could not capture the fact that you can execute arbitrary programs with your PC represented as an FSM.

Your computer is strictly more computationally powerful than an FSM or PDA, even though you could represent particular states of your computer this way.

The fact that you can model an arbitrary CFG as a regular language with limited recursion depth does not mean there's no meaningful distinction between regular languages and CFGs.

krackers 3 days ago||
> you can execute arbitrary programs with your PC represented as an FSM

You cannot execute arbitrary programs with your PC; your PC is limited in how much memory and storage it has access to.

>Your computer is strictly more computationally powerful

The abstract computer is, but _your_ computer is not.

>model an arbitrary CFG as an regular language with limited recursion depth does not mean there’s no meaningful distinction between regular languages and CFG

Yes, this I agree with. But going back to your argument: claiming that LLMs with a fixed context window are basically Markov chains, so they can't do anything useful, is a reductio ad absurdum in the exact same way as claiming that real-world computers are finite state machines.

A more useful argument about the upper bound of computational power would be along the lines of circuit complexity, I think. But even this does not really matter. An LLM does not need to be Turing complete, even conceptually. When paired with tool use, it suffices that the LLM can merely generate programs that are then fed into an interpreter. (And the grammar of Turing-complete programming languages can be made simple enough; you can encode Brainfuck in a CFG.) So even if an LLM could only ever produce programs from a CFG grammar, the combination of LLM + Brainfuck executor would give Turing completeness.
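(For scale, the "executor" half of that combination is tiny; a minimal Brainfuck interpreter is around 25 lines of Python. A sketch, illustration only:)

  def run_bf(code, stdin=""):
      # Minimal Brainfuck interpreter: a tape of byte cells plus 8 commands.
      tape, ptr, pc, out, inp = [0] * 30000, 0, 0, [], iter(stdin)
      jumps, stack = {}, []                 # pre-match [ ] pairs
      for i, c in enumerate(code):
          if c == "[":
              stack.append(i)
          elif c == "]":
              j = stack.pop()
              jumps[i], jumps[j] = j, i
      while pc < len(code):
          c = code[pc]
          if c == ">":   ptr += 1
          elif c == "<": ptr -= 1
          elif c == "+": tape[ptr] = (tape[ptr] + 1) % 256
          elif c == "-": tape[ptr] = (tape[ptr] - 1) % 256
          elif c == ".": out.append(chr(tape[ptr]))
          elif c == ",": tape[ptr] = ord(next(inp, chr(0)))
          elif c == "[" and tape[ptr] == 0: pc = jumps[pc]
          elif c == "]" and tape[ptr] != 0: pc = jumps[pc]
          pc += 1
      return "".join(out)

  # 5 * 13 = 65 -> prints "A"
  print(run_bf("+++++[>+++++++++++++<-]>."))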

Edit: There was this recent HN article along those lines. https://news.ycombinator.com/item?id=46267862.

crystal_revenge 3 days ago||
> so they can't do anything useful

I never claimed that. They demonstrate just how powerful Markov chains can be with sophisticated state representations. Obviously LLMs are useful, I have never claimed otherwise.

Additionally, it doesn't require any logical leaps to understand decoder-only LLMs as Markov chains: they preserve the Markov property and otherwise behave exactly like them. It's worth noting that encoder-decoder LLMs do not preserve the Markov property and cannot be considered Markov chains.

Edit: I saw that post and at the time was disappointed by how confused the author was about those topics and how they apply to the subject.

fleebee 3 days ago||
> The fundamental challenge in AI for the next 20 years is avoiding extinction.

That's a weird thing to end on. Surely it's worth more than one sentence if you're serious about it? As it stands, it feels a bit like the fearmongering Big Tech CEOs use to drive up the AI stocks.

If AI is really that powerful and I should care about it, I'd rather hear about it without the scare tactics.

dist-epoch 3 days ago||
Yeah, well known marketing trick that Big Companies do.

Oil companies: we are causing global warming with all this carbon emissions, are you scared yet? so buy our stock

Pharma companies: our drugs are unsafe, full of side effects, and kill a lot of people, are you scared yet? so buy our stock

Software companies: our software is full of bugs, will corrupt your files and make you lose money, are you scared yet? so buy our stock

Classic marketing tactics, very effective.

Recursing 3 days ago|||
I think https://en.wikipedia.org/wiki/Existential_risk_from_artifici... has much better arguments than the LessWrong sources in other comments, and they weren't written by Big Tech CEOs.

Also "my product will kill you and everyone you care about" is not as great a marketing strategy as you seem to imply, and Big Tech CEOs are not talking about risks anymore. They currently say things like "we'll all be so rich that we won't need to work and we will have to find meaning without jobs"

tejohnso 3 days ago|||
What makes it a scare tactic? There are other areas in which extinction is a serious concern and people don't behave as though it's all that scary or important. It's just a banal fact. And for all of the extinction threats, AI included, it's very easy to find plenty of deep dive commentary if you care.
grodriguez100 3 days ago|||
I would say yes, everyone should care about it.

There is plenty of material on the topic. See for example https://ai-2027.com/ or https://www.lesswrong.com/posts/uMQ3cqWDPHhjtiesc/agi-ruin-a...

emp17344 3 days ago|||
The fact that people here take AI 2027 seriously is embarrassing. The authors are already beginning to walk back these claims: https://x.com/eli_lifland/status/1992004724841906392?s=20
jowea 3 days ago||||
And I thought the rest of the thread was anxiety-inducing. Thanks for the nightmares lol.
dkdcio 3 days ago|||
Fear-mongering science fiction; you may as well cite Dune or Terminator.
defrost 3 days ago|||
There's arguably more dread and quiet constrained horror in With Folded Hands ... (1947)

  Despite the humanoids' benign appearance and mission, Underhill soon realizes that, in the name of their Prime Directive, the mechanicals have essentially taken over every aspect of human life.

  No humans may engage in any behavior that might endanger them, and every human action is carefully scrutinized. Suicide is prohibited. Humans who resist the Prime Directive are taken away and lobotomized, so that they may live happily under the direction of the humanoids. 
~ https://en.wikipedia.org/wiki/With_Folded_Hands_...
XorNot 3 days ago||
This hardly disproves the point: no one is taking this topic seriously. They're just making up a hostile scenario from science fiction and declaring that's what'll happen.
lm28469 3 days ago|||
LessWrong looks like a forum full of terminally online neckbeards who discovered philosophy 48 hours ago; you can dismiss most of what you read there, don't worry.
timmytokyo 3 days ago||
If only they had discovered philosophy. Instead they NIH their own philosophy, falling into the same ditches real philosophers climbed out of centuries ago.
VladimirGolovin 3 days ago||
This has been well discussed before, for example in this book: https://ifanyonebuildsit.com/
AdamWills 18 hours ago||
AI is being used in so many ways, you can’t even imagine it! It’s built to make things easier for users, and nowadays almost every platform uses AI in one way or another. It has become a natural part of everyday life. Recently, I used a platform for marching uniforms and was surprised to see that they also use AI for athlete measurements. Can you believe it?
russfink 3 days ago|
Practical question: when getting the AI to teach you something, eg how attention can be focused in LLMs, how do you know it’s teaching you correct theory? Can I use a metric of internal consistency, repeatedly querying it and other models with a summary of my understanding? What do you all do?
layer8 3 days ago||
> What do you all do?

Google for non-AI sources. Ask several models to get a wider range of opinions. Apply one’s own reasoning capabilities where applicable. Remain skeptical in the absence of substantive evidence.

Basically, do what you did before LLMs existed, and treat LLM output like you would have a random anonymous blog post you found.

akomtu 2 days ago||
In that case, LLMs must be written off as very knowledgeable crackpots because of their tendency to make things up. That's how we would treat a scientist who's caught making things up.
jennyholzer2 3 days ago||
[flagged]
More comments...