Posted by danielfalbo 3 days ago
> The fundamental challenge in AI for the next 20 years is avoiding extinction.
https://lesswrong.com/posts/uMQ3cqWDPHhjtiesc/agi-ruin-a-lis...
Yup, this will absolutely be a big driver of gains in AI for coding in the near future. We actually built a benchmark based on this exact principle: https://algotune.io/
Denoising diffusion models benefited a lot from the U-Net, which is a pretty simple network (compared to a transformer) and very well adapted to the denoising task. Plus, diffusion on images is great to research because it's very easy to visualize, and therefore to wrap your head around.
Doing diffusion on text is a great idea, but my intuition is it will prove more challenging, and will probably take a while before we get something working.
If you know labs / researchers working on the topic, I'd love to read their pages / papers.
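For intuition, here is a minimal sketch of the standard DDPM-style denoising training step, with a tiny conv net standing in for the U-Net; everything below (shapes, schedule values, the placeholder network) is illustrative, not a reference implementation:

    # Sketch of a DDPM-style denoising training step (PyTorch).
    # The small conv net is only a stand-in for a real U-Net.
    import torch
    import torch.nn as nn

    T = 1000
    betas = torch.linspace(1e-4, 0.02, T)
    alpha_bars = torch.cumprod(1.0 - betas, dim=0)   # cumulative noise schedule

    denoiser = nn.Sequential(                        # placeholder for a U-Net
        nn.Conv2d(1, 32, 3, padding=1), nn.ReLU(),
        nn.Conv2d(32, 1, 3, padding=1),
    )
    opt = torch.optim.Adam(denoiser.parameters(), lr=1e-4)

    x0 = torch.randn(8, 1, 28, 28)                   # batch of "clean" images (random here)
    t = torch.randint(0, T, (8,))                    # random timestep per sample
    eps = torch.randn_like(x0)                       # the noise the network must predict
    a_bar = alpha_bars[t].view(-1, 1, 1, 1)
    x_t = a_bar.sqrt() * x0 + (1 - a_bar).sqrt() * eps   # noised input at timestep t

    pred_eps = denoiser(x_t)                         # a real U-Net would also be conditioned on t
    loss = nn.functional.mse_loss(pred_eps, eps)     # learn to predict the added noise
    opt.zero_grad(); loss.backward(); opt.step()

The appeal for images is visible here: you can render x_t and pred_eps at any point and see the denoising happen, which is much harder with discrete text tokens.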
But did any AI researchers actually claim there was no representation of meaning? I thought generally, the criticism of LLMs was that while they do abstract from their corpus - ie, you can regard them as having a representation of "meaning" - it's tightly and inextricably tied to the surface level representation, it isn't grounded in models of the external world, and LLMs have poor ability to transfer that knowledge to other surface encodings.
I don't know who the "certain AI researchers" are supposed to be. But the "stochastic parrot" paper by Bender et al [1] says:
> Text generated by an LM is not grounded in communicative intent, any model of the world, or any model of the reader’s state of mind.
That's a very different objection to the one antirez describes - I think he's erecting a straw man. But I'd be happy to be corrected by anyone more familiar with the research.
This means exactly that no representation of what the model wants to say should exist in the activation states, and that only single-token probabilistic inference should be at play.
Their model also requires the converse: that the model does not know, semantically, what the query really means.
"Stochastic parrot" has a precise scientific meaning, and just from observing how the models behave it is quite evident that they were very wrong. But now we also have strong evidence (via probing) that the sentence you quoted is not correct: the model knows, in general terms, the idea it is going to express, and features for things it will say much later activate many tokens earlier, including conceptual features that only become relevant later in the sentence / concept being expressed.
You are making the common error of stretching the stochastic parrot into something that is no longer scientifically falsifiable: a model that can be enlarged enough to accommodate any evidence arriving from new generations of models. The stochastic parrot does not understand the query and is not trying to reply to you in any way; it just exploits a probabilistic link between the context window and the next word. This link can be more complex than a Markov chain, but it must be of the same kind: lacking any understanding whatsoever and any communicative intent (no representation of the concepts / sentences required to reply correctly). How is it possible to believe this today? And check for yourself what the top AI scientists believe today about the correctness of the stochastic parrot hypothesis.
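For context, "probing" here means training a small classifier on the model's hidden activations to test whether some concept is linearly decodable from them. A toy sketch of the idea, with random placeholder data standing in for real activations (nothing here comes from an actual model):

    # Toy sketch of an activation probe: can a linear classifier recover a concept
    # from hidden states? Random placeholder data stands in for real activations.
    import numpy as np
    from sklearn.linear_model import LogisticRegression

    rng = np.random.default_rng(0)
    hidden = rng.normal(size=(2000, 768))    # pretend residual-stream activations
    concept = rng.integers(0, 2, size=2000)  # pretend label: "will mention concept X later"

    # Plant a weak signal so the probe has something to find (purely illustrative).
    hidden[:, 0] += 1.5 * concept

    probe = LogisticRegression(max_iter=1000).fit(hidden[:1500], concept[:1500])
    print("probe accuracy:", probe.score(hidden[1500:], concept[1500:]))

If a simple probe like this can predict, from the activations at an early token, a concept that only surfaces much later in the output, that is the kind of evidence being referred to.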
> This means exactly that no representation of what the model wants to say should exist in the activation states, and that only single-token probabilistic inference should be at play.
That's not correct. It's clear from the surrounding paragraphs what Bender et al mean by this phrase. They mean that LLMs lack the capacity to form intentions.
> You are making the common error of stretching the stochastic parrot into something that is no longer scientifically falsifiable: a model that can be enlarged enough to accommodate any evidence arriving from new generations of models.
No, I'm not. I haven't, in fact, made any claims about the "stochastic parrot". Rather, I've asked whether your characterisation of AI researchers' views is accurate, and suggested some reasons why it may not be.
Around the world, people ask an LLM and get a response.
Just grouping and analysing these questions, solving each one once centrally, and then making the solution available again is huge.
Working linearly through the most-asked questions, then the next, then the next, will make whatever system is behind it smarter every day.
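A minimal sketch of that idea, assuming a hypothetical answer_with_llm() call and crude exact-match normalisation (a real system would merge near-duplicate questions, e.g. by embedding similarity):

    # Sketch: answer each distinct question once, then serve the cached solution.
    # answer_with_llm() is a hypothetical stand-in for any LLM API call.
    from collections import Counter

    cache = {}              # normalised question -> answer
    frequency = Counter()   # how often each question is asked

    def normalise(question: str) -> str:
        # Crude grouping; a real system would use embeddings to merge paraphrases.
        return " ".join(question.lower().split())

    def answer_with_llm(question: str) -> str:
        return f"(expensively computed answer to: {question})"

    def ask(question: str) -> str:
        key = normalise(question)
        frequency[key] += 1
        if key not in cache:                 # solve it once, centrally
            cache[key] = answer_with_llm(key)
        return cache[key]                    # later askers get the stored solution

    print(ask("How do I reverse a list in Python?"))
    print(ask("how do I reverse a list in Python?"))  # served from the cache
    print(frequency.most_common(1))          # the questions most worth solving well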
I wonder how a "programmers + AI" self-improving loop is different from an "AI only" one.
AGI will also be generic.
LLMs are already very impressive, though.
The two main limitations of the Transformer that it helps with are:
1) A Transformer is just a fixed-size stack of layers, with a one-way flow of data through the layers from input to output. The fixed number of layers equates to how many "thought" steps the LLM can put into generating each word of output, but good responses to harder questions may require many more steps and iterative thinking...
The idea of "think step by step", aka chain of thought, is to have the model break it's response down into a sequence of steps, each building on what came before, so that the scope of each step is withing the capability of the fixed number of layers of the transformer.
2) A Transformer has extremely limited internal memory from one generated word to the next, so telling the model to go one step at a time, feeding its own output back in as input, in effect makes the model's output a kind of memory that makes up for this.
So, chain of thought prompting ultimately gives the model more thinking steps (more words generated), together with memory of what it is thinking, in order to be able to generate a better response.
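A minimal sketch of that feedback loop, with a hypothetical call_llm() standing in for any real model API: each generated step is appended to the prompt, so the model's own output becomes the "memory" it reads on the next pass.

    # Sketch: chain of thought as a prompt-feedback loop.
    # call_llm() is a hypothetical placeholder for any LLM completion API.
    def call_llm(prompt: str) -> str:
        # In a real system this would send `prompt` to a model and return its next step.
        return "Step: (model's next reasoning step would appear here)"

    def answer_step_by_step(question: str, max_steps: int = 5) -> str:
        prompt = f"{question}\nLet's think step by step.\n"
        for _ in range(max_steps):
            step = call_llm(prompt)     # each step uses the same fixed stack of layers
            prompt += step + "\n"       # feed the output back in: external working memory
            if step.strip().lower().startswith("answer"):
                break
        return prompt

    print(answer_step_by_step("What is 17 * 24?"))

The loop makes both points above concrete: more passes through the network means more "thought" steps, and the growing prompt is the memory the transformer itself lacks between words.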
Does your clairvoyance go any further than 2027?
If you assume that we're only one breakthrough away (or zero breakthroughs - just need to train harder), then the step could happen any time. If we're more than one away, though, then where are they? Are they all going to happen in the next two years?
But everybody's guessing. We don't know right now whether AGI is possible at current hardware levels. If it is N breakthroughs away, we all have our own guesses of approximately what N is.
My guess is that we are more than one breakthrough away. Therefore, one can look at the current state of affairs and say that we are unlikely to get to AGI by 2027.
why are you so sensitive?
This reminded me of the movie Don't Look Up, where they basically gambled with humanity's extinction.
> Chain of thought is now a fundamental way to improve LLM output.
That kinda proves _that LLMs back then were pretty much stochastic parrots indeed_, and the skeptics were right at the time. Today, enthusiasts agree with what they previously said: without CoT, the AI feels underwhelming, repetitive and dumb and it's obvious that something more was needed.
Just search past discussions about it: people were saying the problem would be solved with "larger models" (just repeating marketing stuff) and were oblivious to the possibility of other kinds of innovation.
> The fundamental challenge in AI for the next 20 years is avoiding extinction.
That is a low-key sick burn on whoever believes AI will be economically viable short-term. And I have to agree.