Posted by danielfalbo 3 days ago

Reflections on AI at the End of 2025 (antirez.com)
238 points | 358 comments
AdamWills 20 hours ago|
AI is being used in so many ways, you can’t even imagine it! It’s built to make things easier for users, and nowadays almost every platform uses AI in one way or another. It has become a natural part of everyday life. Recently, I used a platform for marching uniforms and was surprised to see that they also use AI for athlete measurements. Can you believe it?
ur-whale 3 days ago||
Not sure I understand the last sentence:

> The fundamental challenge in AI for the next 20 years is avoiding extinction.

danielfalbo 3 days ago||
I think he's referring to AI safety.

https://lesswrong.com/posts/uMQ3cqWDPHhjtiesc/agi-ruin-a-lis...

grodriguez100 3 days ago||
For a perhaps easier to read intro to the topic, see https://ai-2027.com/
dkdcio 3 days ago||
or read your favorite sci-fi novel, or watch Terminator. this is pure bs by a charlatan
timmytokyo 3 days ago|||
It's a tell that he's been influenced by rationalist AI doomer gurus. And a good sign that the rest of his AI opinions should be dismissed.
chrishare 3 days ago||
He's referring to humanity, I believe
A_D_E_P_T 3 days ago||
It's ambiguous. It could go the other way. He could be referring to that oldest of science fiction tropes: the Butlerian Jihad, the human revolt against thinking machines.
AnimalMuppet 3 days ago||
Meh. I think the more likely scenario is the financial extinction of the AI companies.
ofirpress 3 days ago||
> There are certain tasks, like improving a given program for speed, for instance, where in theory the model can continue to make progress with a very clear reward signal for a very long time.

Yup, this will absolutely be a big driver of gains in AI for coding in the near future. We actually built a benchmark based on this exact principle: https://algotune.io/
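
To make the idea concrete: here is a minimal sketch of what such a "clear reward signal" could look like, where the reward is simply the measured speedup of a candidate implementation over a reference, gated on correctness. The function names and harness are hypothetical illustrations, not AlgoTune's actual code:

```python
import time

def measure_speedup(reference_fn, candidate_fn, make_input, trials=5):
    """Hypothetical reward signal: how much faster the candidate is than the
    reference on the same input, with the output checked for correctness."""
    def best_time(fn, x):
        times = []
        for _ in range(trials):
            start = time.perf_counter()
            fn(x)
            times.append(time.perf_counter() - start)
        return min(times)

    x = make_input()
    if candidate_fn(x) != reference_fn(x):   # reject incorrect "optimizations"
        return 0.0
    return best_time(reference_fn, x) / best_time(candidate_fn, x)

# Reward > 1.0 means the candidate is both correct and faster, so a model can
# keep iterating against this signal for a long time.
baseline = lambda xs: sorted(xs)
candidate = lambda xs: sorted(xs)            # stand-in for a model-proposed rewrite
print(measure_speedup(baseline, candidate, lambda: list(range(10_000, 0, -1))))
```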

agumonkey 3 days ago||
There are videos about Diffusion LLMs too, which apparently get rid of linear token generation. But I'm no ML engineer.
nephanth 3 days ago|
As someone who has worked on transformer-based diffusion models before (not for language, though), I can say one thing: they're hard.

Denoising diffusion models benefited a lot from the U-Net, which is a pretty simple network (compared to a transformer) and very well adapted to the denoising task. Plus, diffusion on images is great to research because it's very easy to visualize, and therefore to wrap your head around.

Doing diffusion on text is a great idea, but my intuition is that it will prove more challenging and will probably take a while before we get something working.

agumonkey 3 days ago||
Thanks. Do you see that part of the field as plateauing or ramping up (even taking into account the difficulty)?

If you know of labs / researchers working on the topic, I'd love to read their pages / papers.

phlummox 2 days ago||
> For years, despite functional evidence and scientific hints accumulating, certain AI researchers continued to claim LLMs were stochastic parrots: probabilistic machines that would: 1. NOT have any representation about the meaning of the prompt. 2. NOT have any representation about what they were going to say.

But did any AI researchers actually claim there was no representation of meaning? I thought the criticism of LLMs was generally that while they do abstract from their corpus - i.e., you can regard them as having a representation of "meaning" - it's tightly and inextricably tied to the surface-level representation, it isn't grounded in models of the external world, and LLMs have poor ability to transfer that knowledge to other surface encodings.

I don't know who the "certain AI researchers" are supposed to be. But the "stochastic parrot" paper by Bender et al [1] says:

> Text generated by an LM is not grounded in communicative intent, any model of the world, or any model of the reader’s state of mind.

That's a very different objection to the one antirez describes - I think he's erecting a straw man. But I'd be happy to be corrected by anyone more familiar with the research.

[1] https://dl.acm.org/doi/10.1145/3442188.3445922

antirez 2 days ago|
> Text generated by an LM is not grounded in communicative intent

This means exactly that no representation should exist in the activation states about what the model wants to say, and that only single-token probabilistic inference is at play.

Their model also requires the contrary: that the model does not know, semantically, what the query really means.

"Stochastic parrot" has a scientific meaning, and just from observing how the models function it was already quite evident that they were very wrong. But now we have strong evidence (via probing) that the sentence you quoted is not correct either: the model represents the idea it wants to express in general terms, and features about things it is going to say much later activate many tokens earlier, including conceptual features that only become relevant later in the sentence / concept expressed.

You are making the error that is common in this context: stretching the stochastic parrot into a hypothesis that is no longer scientifically isolated, one elastic enough to accommodate any evidence arriving from new generations of models. The stochastic parrot does not understand the query, nor is it trying to reply to you in any way; it just exploits a probabilistic link between the context window and the next word. This link can be more complex than a Markov chain, but it must be of the same kind: lacking any understanding and any communicative intent (no representation of the concepts / sentences required to reply correctly). How is it possible to believe this today? And check for yourself what the top AI scientists today believe about the correctness of the stochastic parrot hypothesis.
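
(For readers unfamiliar with the probing technique mentioned above: the idea is to train a small linear classifier on a model's hidden activations and check whether a concept is decodable well before the tokens expressing it are generated. Below is a toy sketch with synthetic activation vectors, not the actual experiments; in real work the vectors would be extracted from an LLM's residual stream.)

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# Toy probing setup: pretend these are hidden states captured at an early token
# position, and the label is a concept that only shows up later in the output
# (e.g. whether the eventual answer will mention a particular topic).
rng = np.random.default_rng(0)
n, d = 2000, 256
concept = rng.integers(0, 2, size=n)        # the "later" concept, 0 or 1
direction = rng.normal(size=d)              # assume the model encodes it along one direction
hidden = rng.normal(size=(n, d)) + np.outer(concept, direction)

# A linear probe: if it classifies the concept far above chance from the early
# activations, the representation is already there before the words are emitted.
probe = LogisticRegression(max_iter=1000).fit(hidden[:1500], concept[:1500])
print("probe accuracy:", probe.score(hidden[1500:], concept[1500:]))
```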

phlummox 2 days ago||
> > Text generated by an LM is not grounded in communicative intent

> This means exactly that no representation should exist in the activation states about what the model wants to say, and that only single-token probabilistic inference is at play.

That's not correct. It's clear from the surrounding paragraphs what Bender et al mean by this phrase. They mean that LLMs lack the capacity to form intentions.

> You are doing the big error that is common to do in this context of extending the stochastic parrot to a non scientifically isolated model that can be made large enough to accomodate any evidence arriving from new generations of models.

No, I'm not. I haven't, in fact, made any claims about the "stochastic parrot". Rather, I've asked whether your characterisation of AI researchers' views is accurate, and suggested some reasons why it may not be.

Aiisnotabubble 3 days ago||
Something else that happens, regardless of AGI: global RL.

Around the world people ask an LLM and get a response.

Just grouping and analysing these questions, solving them once centrally, and then making the solution available again is huge.

Linearly solving the most-asked questions, then the next one, then the next, will make whatever system is behind it smarter every day.
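
A rough sketch of what that grouping step could look like: embed incoming questions, treat near-duplicates as the same question, and serve a cached, centrally produced answer whenever a new question is close enough to one already solved. The embedding function and similarity threshold are placeholders:

```python
import numpy as np

class AnswerCache:
    """Toy sketch of "solve once centrally, reuse everywhere".
    `embed` is a placeholder for any sentence-embedding model that
    maps a question string to a numpy vector."""

    def __init__(self, embed, threshold=0.9):
        self.embed = embed
        self.threshold = threshold
        self.vectors = []   # embeddings of questions already solved
        self.answers = []   # the centrally produced solutions

    def lookup(self, question):
        """Return a cached answer if a semantically similar question was solved."""
        if not self.vectors:
            return None
        q = self.embed(question)
        sims = [float(q @ v / (np.linalg.norm(q) * np.linalg.norm(v)))
                for v in self.vectors]
        best = int(np.argmax(sims))
        return self.answers[best] if sims[best] >= self.threshold else None

    def store(self, question, answer):
        """Record a newly solved question so future near-duplicates hit the cache."""
        self.vectors.append(self.embed(question))
        self.answers.append(answer)

# Usage: on a cache miss, solve the question once (expensively, centrally),
# call store(), and every later near-duplicate is answered from the cache.
```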

danielfalbo 3 days ago|
Exactly. The singularity is already here. It's just "programmers + AI" as a whole, rather than independent self-improvements of the AI.

I wonder how a "programmers + AI" self-improving loop is different from an "AI only" one.

bryanrasmussen 3 days ago|||
The AI-only one presumably has a much faster response time. The singularity is thus not here, because programmer time is still the bottleneck, whereas, as I understand it, in the singularity time is no longer a bottleneck.
yeasku 21 hours ago||||
You are all crazy.
Aiisnotabubble 3 days ago|||
AGI will be faster, as it won't need an initial question.

AGI will also be generic.

LLMs are already very impressive, though.

register 3 days ago||
Where can I learn more about how chain of thought really affects LLM performance? I read the seminal paper, but all it says is that it's basically another prompt-engineering technique that improves accuracy.
HarHarVeryFunny 3 days ago|
Chain of thought, and now "reasoning", is basically a workaround for the simplistic nature of the Transformer neural network architecture that all LLMs are based on.

The two main limitations of the Transformer that it helps with are:

1) A Transformer is just a fixed-size stack of layers, with a one-way flow of data through the layers from input to output. The fixed number of layers equates to how many "thought" steps the LLM can put into generating each word of output, but good responses to harder questions may require many more steps and iterative thinking...

The idea of "think step by step", aka chain of thought, is to have the model break its response down into a sequence of steps, each building on what came before, so that the scope of each step is within the capability of the transformer's fixed number of layers.

2) A Transformer has extremely limited internal memory from one generated word to the next, so telling the model to go one step at a time, feeding its own output back in as input, in effect makes the model's output a kind of memory that makes up for this.

So, chain-of-thought prompting ultimately gives the model more thinking steps (more words generated), together with memory of what it is thinking, in order to generate a better response.
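
A minimal sketch of the difference in prompting. The `generate` function is a stand-in for any LLM completion call, not a real API; the point is only the shape of the two prompts:

```python
# Toy illustration of chain-of-thought prompting. `generate` is a stand-in for
# any LLM completion call; plug in whatever client you use.
def generate(prompt: str) -> str:
    raise NotImplementedError("plug in your LLM client here")

question = "A train leaves at 9:40 and the trip takes 2h 35m. When does it arrive?"

# Direct prompt: the model must produce the answer in one shot, so all the
# "thinking" has to fit inside the fixed number of transformer layers.
direct_prompt = f"{question}\nAnswer:"

# Chain-of-thought prompt: asking for intermediate steps lets the model's own
# output act as working memory, since each step is fed back in as input.
cot_prompt = f"{question}\nLet's think step by step, then give the final answer."

# answer_direct = generate(direct_prompt)
# answer_cot = generate(cot_prompt)
```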

Fraterkes 3 days ago||
It’s interesting that half the comments here are talking about the extinction line when, now that we’re nearly entering 2026, I feel the 2027 predictions have been shown to be pretty wrong so far.
squidbeak 3 days ago||
> I feel the 2027 predictions have been shown to be pretty wrong so far

Does your clairvoyance go any further than 2027?

AnimalMuppet 3 days ago|||
I don't know that it's "clairvoyance". We're two weeks from 2026. You'd expect to be able to see somewhat more than we do now if this were going to turn into AGI by 2027.

If you assume that we're only one breakthrough away (or zero breakthroughs - just need to train harder), then the step could happen any time. If we're more than one away, though, then where are they? Are they all going to happen in the next two years?

But everybody's guessing. We don't know right now whether AGI is possible at current hardware levels. If it is N breakthroughs away, we all have our own guesses of approximately what N is.

My guess is that we are more than one breakthrough away. Therefore, one can look at the current state of affairs and say that we are unlikely to get to AGI by 2027.

jennyholzer2 3 days ago|||
> Does your clairvoyance go any further than 2027?

why are you so sensitive?

alexgotoi 3 days ago||
> * The fundamental challenge in AI for the next 20 years is avoiding extinction.

This reminded me of the movie Don't Look Up, where they basically gambled with humanity's extinction.

gaigalas 3 days ago|
This post is a bait for enthusiasts. I like it.

> Chain of thought is now a fundamental way to improve LLM output.

That kinda proves _that LLMs back then were pretty much stochastic parrots indeed_, and the skeptics were right at the time. Today, enthusiasts agree with what the skeptics previously said: without CoT, the AI feels underwhelming, repetitive, and dumb, and it's obvious that something more was needed.

Just search past discussions about it: people were saying the problem would be solved with "larger models" (just repeating marketing claims) and were oblivious to the possibility of other kinds of innovation.

> The fundamental challenge in AI for the next 20 years is avoiding extinction.

That is a low level sick burn on whoever believes AI will be economically viable short-term. And I have to agree.
