Posted by gmays 2 days ago
The conjecture that every aspect of learning or any other feature of intelligence can in principle be so precisely described that a machine can be made to simulate it.
The only AI explainer you'll need: https://kemendo.com/Understand-AI.html
Do current AI tools genuinely pose such risks?
Useful = great. We've made incredible progress in the past 3-5 years.
The people who are disappointed have their standards and expectations set at "science fiction".
From what I've seen, in response to that, goalposts are then often moved in whatever way requires the least updating of somebody's political, societal, metaphysical, etc. worldview. (This also includes updates in favor of "this will definitely achieve AGI soon", fwiw.)
That's certainly not coming back.
It's not a real thing. You do not remember the goal posts ever being there.
Turing put forth a thought experiment in the early days of discussions about "artificial" thinking machines, at a very philosophical level.
Add to that, nobody who claims to have "passed" the Turing test has ever actually run the thought experiment as described, which is about questioning two respondents and figuring out which one is human. It is NOT talking to a single respondent and deciding whether it's an LLM or not.
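For concreteness, here's a minimal Python sketch of that two-respondent setup (the `interrogator`, `ask_human`, and `ask_machine` objects are hypothetical stand-ins, not any real harness):

    import random

    def imitation_game(interrogator, ask_human, ask_machine, rounds=5):
        # Turing's setup: the judge questions TWO anonymous respondents
        # and must decide which one is the human -- not whether a single
        # respondent is a machine.
        channels = {"A": ask_human, "B": ask_machine}
        if random.random() < 0.5:  # hide who sits behind which channel
            channels = {"A": ask_machine, "B": ask_human}
        transcript = []
        for _ in range(rounds):
            q = interrogator.next_question(transcript)
            transcript.append(("A", channels["A"](q)))
            transcript.append(("B", channels["B"](q)))
        guess = interrogator.which_is_human(transcript)  # "A" or "B"
        return channels[guess] is ask_human  # True iff the judge was right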
It also has never been considered a valid "test" of "intelligence": it was obvious from the very beginning that tricking a person isn't really meaningful, since most people can be tricked by even simple systems.
ELIZA was the end of any serious thought around "the Turing test": it was able to "trick" tons of people and showed how useless the Turing thought experiment was. Anyone who claims ELIZA is intelligent would be very silly.
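To see how little machinery is needed, an ELIZA-style responder is just a handful of regex rewrite rules. The rules below are illustrative, not Weizenbaum's actual DOCTOR script:

    import re

    # Each rule pairs a pattern with a canned response template.
    RULES = [
        (re.compile(r"\bi need (.+)", re.I), "Why do you need {0}?"),
        (re.compile(r"\bi am (.+)", re.I), "How long have you been {0}?"),
        (re.compile(r"\bmy (\w+)", re.I), "Tell me more about your {0}."),
    ]

    def eliza_reply(utterance):
        for pattern, template in RULES:
            m = pattern.search(utterance)
            if m:
                return template.format(*m.groups())
        return "Please go on."  # default deflection

    print(eliza_reply("I am worried about my future"))
    # -> How long have you been worried about my future?

A few dozen rules like these were enough to "trick" people in 1966, which says more about people than about the program.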
I don't think it's fair to deride people who are disappointed in LLMs for not being AGI when many very prominent proponents have been claiming they are or soon will be exactly that.
But now, we have LLMs that can reliably beat video games like Pokemon, without any specialized training for playing video games. And those same LLMs can write code, do math, write poetry, be language tutors, find optimal flight routes from one city to another during the busy Christmas season, etc.
How does that not fit the definition of "General Intelligence"? It's literally as capable as a high school student for almost any general task you throw it at.
I'm not sure which party "they" refers to here, since the arc-agi-3 dataset isn't released yet and labs probably haven't begun targeting it. For arc-agi-2, synthetic data alone might have been enough to saturate the benchmark: most frontier models do well on it, yet we haven't seen any corresponding jump in multimodal skill use, with maybe the exception of "nano banana".
> lend itself well to token based “reasoning”
One could perhaps do reasoning/CoT with vision tokens instead of just text tokens. Or reasoning in latent space, which I guess might be even better. There have been papers on both, but I don't know if either approach scales. Regardless, gemini 3 / nano banana have had big gains on visual and spatial reasoning, so they must have done something to get multimodality with cross-domain transfer in a way that 4o/gpt-image wasn't able to.
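Roughly, the distinction between the two approaches looks like this. A sketch against a hypothetical `model` interface (`generate`, `encode`, `step`, and `decode` are all assumed here, not any lab's real API):

    def token_space_cot(model, prompt_tokens, n_steps):
        # Standard CoT: each intermediate "thought" is decoded into
        # discrete tokens (text or, in principle, vision tokens) and
        # appended back into ordinary context.
        context = list(prompt_tokens)
        for _ in range(n_steps):
            context.extend(model.generate(context))  # discrete tokens out
        return model.generate(context)               # final answer

    def latent_space_reasoning(model, prompt_tokens, n_steps):
        # Latent reasoning: the hidden state is looped back directly,
        # deferring the lossy decode-to-tokens step until the very end.
        hidden = model.encode(prompt_tokens)
        for _ in range(n_steps):
            hidden = model.step(hidden)  # stay in continuous latent space
        return model.decode(hidden)      # decode only for the final answer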
For arc-agi-3, the missing pieces seem to be both "temporal reasoning" and efficient in-context learning. If they can train for this, it'd have benefits for things like tool-calling as well, which is why it's an exciting benchmark.
No; that was one, extremely limited example of a broader idea. If I point out that your machine is not a general calculator because it gives the wrong answer for six times nine, and then you fix the result it gives in that case, you have not refuted me. If I now find that the answer is incorrect in some other case, I am not "moving goalposts" by pointing it out.
(But also, what lxgr said.)
> But now, we have LLMs that can reliably beat video games like Pokemon, without any specialized training for playing video games. And those same LLMs can write code, do math, write poetry, be language tutors, find optimal flight routes from one city to another during the busy Christmas season, etc.
The AI systems that do most of these things are not "LLMs".
> It's literally as capable as a high school student for almost any general task you throw it at.
And yet embarrassing deficiencies are found all the time ("how many r's in strawberry", getting duped by straightforward problems dressed up to resemble classic riddles but without the actual gotcha, etc.).
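The strawberry one is trivially checkable outside the model, which is what makes it embarrassing (the usual explanation being that tokenizers expose word pieces, not individual characters):

    >>> "strawberry".count("r")
    3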
Uh, every single example that I listed except for the 'playing video games' example is something that I regularly use frontier models to do for myself. I have ChatGPT and Gemini help me find flight routes, tutor me in Spanish (Gemini 3 is really good at this), write poetry and code, solve professional math problems (usually related to finance and trading), help me fix technical issues with my phone and laptop, etc etc.
If you say to yourself, "hey this thing is a general intelligence, I should try to throw it at problems I have generally", you'll find yourself astonished at the range of tasks with which it can outperform you.
LLMs are at most one component of the systems you refer to. Reasoning models and agents are something larger.
> If you say to yourself, "hey this thing is a general intelligence, I should try to throw it at problems I have generally", you'll find yourself astonished at the range of tasks with which it can outperform you.
Where AI has been thrust at me (search engines and YouTube video and chat summaries) it has been for the sort of thing where I'd expect it to excel, yet I've been underwhelmed. The one time I consciously invoked the "AI assist" on a search query (to do the sort of thing I might otherwise try on Wolfram Alpha) it committed a basic logical error. The project READMEs that Show HN has exposed me to this year have been almost unfailingly abominable. (Curiously, I'm actually okay with AI art a significant amount of the time.)
But none of that experience is even a hundredth as annoying as the constant insinuation from AI proponents that any and all opposition is in some way motivated by ego protection.
Okay, enough eggnog and posting.
It also seems orders of magnitude less resource efficient than higher-level approaches.