This doesn't say much, and the author fights their own points a couple of times, which suggests they maybe didn't think through what they wanted to write until they were in the middle of writing it and realized their assumptions didn't match what they expected the data to say.
I really don't get the point of what I just read.
Model reasoning is on an s-curve, which is improving.
Model intelligence is not the same as reasoning. It's a different axis, and one I have not seen much movement on.
See, humans have a recursive form of intelligence which is capable of self-reflection and introspection. LLMs can only reason about tokens which have already been emitted. Humans and LLMs do not share the same form of reasoning, and general human-like intelligence will not arise from the current architecture of LLMs. Therefore it is a mistake to assume that continual improvement on the reasoning scale will result in something that is equivalent enough to humans on the intelligence axis to replace all labor.
No, definitely not saying this, and I don’t quite know what it means.
> Model reasoning is on an s-curve, which is improving.
Is this saying two different things? I think I might agree with this in principle, as in maybe there is some sort of s-curve or something like it, but do we see evidence of this? Where?
> Model intelligence is not the same as reasoning. It's a different axis, and one I have not seen much movement on.
Can you clarify this? What is the distinction, and what makes you say you have “not seen much movement”?
> See, humans have a recursive form of intelligence which is capable of self-reflection and introspection. LLMs can only reason about tokens which have already been emitted
LLMs do self reflection and introspection in context, and tweaks such as value functions (serving a similar purpose to intuition or emotion) may make this better. Why do you feel self reflection and introspection are a fundamental limitation here? Models reason over tokens they have emitted and also with their own sense and learned behavior already. Are you just talking about continual learning?

Also, I feel people just latch onto LLMs as if this is all of AI. Why? SSMs, memory networks, recurrent neural networks, etc. are all part of AI but aren’t as popular because they can’t yet compete with LLMs in terms of scaling laws and training efficiency, due to e.g. hardware and software optimization and investment being focused on LLMs. If something else comes along that works better, we’ll just start scaling that.
> Humans and LLMs do not share the same form of reasoning, and general human-like intelligence will not arise from the current architecture of LLMs.
Very strong statement, any theoretical or experimental basis for this? I also don’t particularly care personally, other than as a point of curiosity. Why does it matter if AI systems develop reasoning mechanisms equivalent to those of humans? In fact it may be much better not to.
> Therefore it is a mistake to assume that continual improvement on the reasoning scale will result in something that is equivalent enough to humans to replace all labor.
Idk, I didn’t say this explicitly, but I also don’t think it matters if we have a system “equivalent to humans” or one that “replaces all labor”.
I am making the argument that how we measure model intelligence is flawed, and that we are actually measuring something closer to "reasoning" than "intelligence". If you want evidence, we'll need a different form of test, but how about I just gesture at the fact that GPT supposedly outscored PhDs on a broad range of subjects at least a year ago and to date is not replacing PhD jobs.
We see this pattern of high scores on tests but mediocre performance in the real world all over the place. From that, I draw the conclusion that it can reason like a PhD, but it can't think like a PhD.
So, we may see an s-curve on the measure of model reasoning but that doesn't imply they will overtake us or even match us on measures of intelligence.
As to your other questions:
> LLMs do self reflection and introspection in context,
> Why do you feel self reflection and introspection are a fundamental limitation here? Models reason over tokens they have emitted and also with their own sense and learned behavior already. Are you just talking about continual learning?
I disagree that models are reflecting and introspecting in a way equivalent to human intelligence here. They can reason over tokens which have been emitted, but by the same measure they cannot reason over tokens which have not been emitted. It's hard to make this point without drawing some diagrams, but I believe that human intelligence has internal loops, where many ideas may be turned over simultaneously before an action is taken. In comparison, an LLM might "feel uncertain" about a token before emitting it, but once it is emitted that uncertainty and the other near neighbor options are lost and the LLM is locked into the track that was set by the top-choice token. I think this is where hallucinations arise from, amongst other issues.
Context isn't sufficient for an internal reasoning loop because the tokens that compose context lose a lot of the information the network itself generated when picking those tokens. They occupy a much lower dimensional space than the "internal reasoning" processes of the transformer do.
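To make that concrete, here's a toy sketch of a single decoding step (plain numpy; the names softmax, decode_step, logits, and context_ids are mine, not any real library's API, and this is only an illustration of the collapse I'm describing, not how any particular model is implemented):

    import numpy as np

    def softmax(x):
        e = np.exp(x - x.max())
        return e / e.sum()

    def decode_step(logits, context_ids):
        # Full next-token distribution: this is where the "uncertainty"
        # and the near-neighbor alternatives live, briefly.
        probs = softmax(np.asarray(logits, dtype=float))
        next_id = int(np.argmax(probs))   # collapse to a single choice
        context_ids.append(next_id)
        # Only the chosen id is carried forward into the context; probs is
        # discarded, so later steps cannot revisit the runners-up.
        return context_ids

    # Two almost-tied candidates: the near-miss at index 1 is lost for good.
    decode_step([2.00, 1.99, -3.0], [])

The point of the toy is just that everything other than the single chosen id is thrown away before the next step.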
>> Humans and LLMs do not share the same form of reasoning, and general human-like intelligence will not arise from the current architecture of LLMs.
> Very strong statement, any theoretical or experimental basis for this?
It's just my theory, but this is what I have been gesturing at. You already know about RNNs, so I'll put it in those terms: the core of an intelligent network should be an RNN, not a transformer. But we fundamentally cannot train a network like that to work like an LLM, because backprop doesn't work when there is infinite recursion, and without being able to bootstrap off of the knowledge and reasoning baked into human text there's no sufficient source of training material beyond being embodied.
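For what it's worth, here's the kind of contrast I have in mind as a toy sketch (again numpy, with made-up shapes and names like rnn_step, W_h, W_x; it only shows the forward recursion, not training, since training it is exactly the problem):

    import numpy as np

    rng = np.random.default_rng(0)
    W_h = rng.normal(scale=0.1, size=(8, 8))   # hidden-to-hidden weights
    W_x = rng.normal(scale=0.1, size=(8, 4))   # input-to-hidden weights

    def rnn_step(h, x):
        # The whole hidden vector persists between steps: an internal loop
        # that never has to be collapsed into one discrete token.
        return np.tanh(W_h @ h + W_x @ x)

    h = np.zeros(8)
    for t in range(5):
        x = rng.normal(size=4)   # stand-in for an observation
        h = rnn_step(h, x)       # rich state carried forward, nothing emitted

An LLM's only persistent state between steps is the tokens it has already emitted; the sketch above carries the full hidden vector forward instead.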
---
EDIT:
I missed this, which I also want to reply to:
> Why does it matter if AI systems develop reasoning mechanisms equivalent to those of humans? In fact it may be much better not to.
I actually agree that it may be better if they did not develop equivalent reasoning, but I don't see a world in which machines replace human labor without being intellectually equivalent.
As I think about it though, "dumb" machines which can follow reasoning but not think like humans are a rather scary proposition, honestly. It seems like a tool that would be wielded without restraint by those in power to control those who aren't.
> But those skeptics are initially responding to the constant AI hype claims that we are exponentially growing to AGI.
This is a meaningless statement or at best just strawmanning.
The evidence is just whatever it is - we cannot make predictions with it.
As for the basis of your objection, this smacks of intellectual gatekeeping. Plenty of good writing is by people who are not academically qualified or a recognized expert in the topic they're writing about. Indeed, very often, this kind of writing is better than writing by experts. Experts often write for other experts, and this can be exclusionary to lay readers. When a non-expert learns about a topic then writes about it for a general audience, they tend to be just a step ahead of the audience, and so the reader is able to learn about the topic by following the process of discovery and reasoning that the author just experienced. Sure, they often get some details or concepts wrong, but the discussion on a site like HN can draw other perspectives, and – very often – contributions from experts, which leads to further expansion in everyone's understanding of the topic.
HN's very ethos is to gratify intellectual curiosity, and this kind of writing is highly compatible with that. A few things this kind of writing often does well:
- Making connections to other subjects that an expert would miss. The hall of fame of sigmoid predictions is just excellent; I already know I'm going to be reminded of it some time in the future. A very entertaining way to get the point across.
- Writing about tricky concepts in a very accessible and elegant way, which experts are notoriously bad at doing themselves - they are often optimizing for other specialists.
- Being able to write with an air of speculation and experimentation with ideas that experts and institutions often can't afford. Experts have to maintain their track record; Scott Alexander can say "lol just double the timeline".
It's good that you come to HN expecting high standards of content and discussion.
> sCotT aLexAndEr
This counts as a sneer, which is against the guidelines (hn@ycombinator.com). You may not owe the writer anything, but you owe the audience better than this.
> as close as you can come to the modern dressed up version of a eugenicist
Their writing about genetic determinism is a turnoff to me too. But this essay is about a different topic, and a piece by a writer known for writing substantively about a variety of topics should be evaluated on its own terms.
Allowing slop articles like this literally prints money for them.