But maybe that's ASI. Whereas I consider ChatGPT 3 to be "baby AGI". That's why it became so popular so fast.
ChatGPT became popular because it was easy to use and amusing. (LLM UX until then had been crappy.)
Not sure AGI aspirations had anything to do with uptake.
I don't have an opinion on whether ChatGPT qualifies as AGI. What I'm saying is where one stands on that question has nothing to do with "why it became so popular so fast."
(Also, several machine-learning techniques could do millions of things terribly before LLMs. GPT does them, and other things, less poorly. It's a broadening. But I suppose really any intelligence of any kind can be considered a "baby" AGI.)
The "ChatGPT" web app started with the underlying model GPT-3.5
The predecessor models, a whole series of them collectively "GPT-3" but sold under API with names like "davinci" and "ada", was barely noticed outside AI research circles.
3 was useful, but you had to treat it as a text completion system not a chat interface, your prompt would have been e.g.
Press release
Subject: President announces imminent asteroid impact, evacuation of Florida
My fellow Americans,
Because if you didn't put "My fellow Americans," in there, it would then suggest a bunch of other press release subjects.

Edit: toned down the preachiness.
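For the curious, this is roughly what that looked like in code against the old completion-style endpoint. A minimal sketch only: the legacy pre-1.0 client and the exact parameter values are illustrative, not a prescription.

    import openai  # legacy pre-1.0 style client, shown for illustration only

    prompt = (
        "Press release\n"
        "Subject: President announces imminent asteroid impact, evacuation of Florida\n\n"
        "My fellow Americans,"
    )

    # The model just continues the text, so the prompt has to be shaped so that
    # the desired output (the body of the press release) is the natural continuation.
    response = openai.Completion.create(
        model="davinci",   # one of the original GPT-3 engines
        prompt=prompt,
        max_tokens=200,
        temperature=0.7,
    )
    print(prompt + response.choices[0].text)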
Edit due to rate-limiting, which in turn appears to be due to the inexplicable downvoting of my question: since you (JumpCrisscross) are imputing a human-like motivation to the model, it sounds like you're on the side of those who argue that AGI has already been achieved?
Lying != fallibility.
Is it about jobs/tasks, or cognitive capabilities? The majority of the AI valley seems to focus on the former; TFA focuses on the latter.
Can it do tasks, or jobs? Jobs are bundles of tasks. AI might be able to do 90% of tasks for a given job, but not the whole job.
If tasks, what counts as a task: Is it only specific things with clear success criteria? That's easier.
Is scaffolding allowed: Does it need to be able to do the tasks/jobs without scaffolding and human-written few-shot prompts?
Today's tasks/jobs only, or does it include future ones too? As tasks and jobs get automated, jobs evolve and get re-defined. So, being able to do the future jobs too is much harder.
Remote only, or in-person too: In-person too is a much higher bar.
What threshold of tasks/jobs: "most" is apparently typically understood to mean 80-95% (Mira Ariel). Automating 80% of tasks is different from 90%, 95%, and 99%, with diminishing returns. And how are the tasks counted: by frequency, dollar-weighted, or by unique count of tasks?
Only economically valuable tasks/jobs, or does it include anything a human can do?
A high-order bit on many people's AGI timelines is which definition of AGI they're using, so clarifying the definition is nice.
If it does an hour of tasks, but creates an additional hour of work for the worker...
1) Defining intelligence is very difficult, almost impossible; defining artificial intelligence even more so.
2) There are many types of human intelligence. Verbal intelligence is one of them and the one most readily compared with LLMs.
3) Machines (not only LLMs but all of them, robots included) excel where humans are weak and vice versa, without exception, because of their different backgrounds. Comparing the two is meaningless and unfair to both; let each complement the other.
4) AGI remains a valid target, but we are still very far from it, as with other grand goals: controlling DNA to treat arbitrary genetic diseases, solving Earth's resource problems and harnessing other planets, creating a near-perfect sociopolitical system with no inequality. The Singularity is just another item on that list.
5) I am impressed by how far a PC cluster has come by "shuffling tokens", but on the other hand I am pessimistic about how much further it can go given finite input/training data.
I can't begin to count the number of times I've encountered someone who holds an ontological belief for why AGI cannot exist and then, for some reason, formulates it as a behavioralist criterion. This muddying of argument results in what looks like a moving of the goalposts. I'd encourage folks to be clearer about whether they believe AGI is ontologically possible or impossible, in addition to any behavioralist claims.
Unclear to me what you mean. I would certainly reject the ontological possibility of intelligent computers, where computation is defined by the Church-Turing thesis. It's not rocket science, but it is difficult for some people to see without a sound, basic grasp of metaphysics and the foundations of CS. Magical thinking and superstition come more easily then. (I've already given an explanation of this in other posts ad nauseam. In a number of cases, people get argumentative out of ignorance and misunderstanding.)
However, I don't reject out of hand the possibility of computers doing a pretty good job of simulating the appearance of intelligence. There's no robust reason to think that passing the Turing test implies intelligence. A good scarecrow looks human enough to many birds, but that doesn't mean it is human.
But the Turing test is not an especially rigorous test anyway. It appeals to the discernment of the observer, which is variable, and then there's the question of how much conversation or behavior, and in what range of circumstances, you need before you can make the call. Even in some unrealistic and idealized thought experiment, if a conversation with an AI were, under perfect discernment, completely indistinguishable from a conversation with a human being, that would still not provide a causal account of what was observed. You would have shown only a perfect correlation, at best.
The "Turing test" I always saw described in literature, and the examples of what passing output from a machine was imagined to look like, are nothing like what's claimed to pass nowadays. Honestly, a lot of the people claiming that contemporary chatbots pass come across like they would have thought ELIZA passed.
With today's chat bots, it's absolutely trivial to tell that you're not talking to a real human. They will never interrupt you, continue their train of thought even though you're trying to change the conversation, go on a complete non-sequitur, swear at you, etc. These are all things that the human "controls" should be doing to prove to the judges that they are indeed human.
LLMs are nowhere near beating the Turing test. They may fool some humans in some limited interactions, especially if the output is curated by a human. But if you're left alone to interact with the raw output for more than a few lines, and you're actively trying to tell whether you're interacting with a human or an AI (instead of wanting to believe), there really is no chance you'd be tricked.
So in that sense it's a triviality. You can ask ChatGPT whether it's human and it will say no upfront. And it has various guardrails in place against too much "roleplay", so you can't just instruct it to act human. You'd need a different post-training setup.
I'm not aware whether anyone did that with open models already.
Post-training them to speak like a bot and deny being human has no effect on how useful they are. That's just an OpenAI/Google/Anthropic preference.
>If you take the raw model, it will actually be much worse at the kinds of tasks you want it to perform
Raw models are not worse. Literally every model release paper that compares both shows them as better on benchmarks, if anything. Post-training degrading performance is a well-known phenomenon. What they are is more difficult to guide/control. Raw models are less useful because you have to present your input in certain ways, but they are not worse performers.
It's beside the point anyway because, again, you don't have to post-train them to act as anything other than a human.
>If their behavior needs to be restricted to actually become good at specific tasks, then they can't also be claimed to pass the Turing test if they can't within those same restrictions.
Okay, but that's not the case.
This is exactly what I was referring to.
But that is exactly the point of the Turing test.
If someone really wants to see a Turing-passing bot, I guess someone could try making one but I'm doubtful it would be of much use.
Anyway, people forget that the thought experiment by Turing was a rhetorical device, not something he envisioned building. The point was to say that semantic debates about "intelligence" are distractions.
When DeepMind was founded (2010), their definition was the following: AI is a system that learns to perform one thing; AGI is a system that learns to perform many things at the same time.
I would say that whatever we have today, "as a system", matches that definition. In other words, the "system" that is, say, GPT-5/Gemini 3/etc. has learned to "do" (while "do" is debatable) a lot of tasks (read/write/play chess/code/etc.) "at the same time". And from a "pure" ML point of view, it learned those things from the "simple" core objective of next-token prediction (+ enhancements later, RL, etc.). That is pretty cool.
So I can see that as an argument for "yes".
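For concreteness, that "simple" core objective is just cross-entropy on the next token. A minimal sketch, assuming a model that maps token ids to per-position vocabulary logits (PyTorch used only for the loss):

    import torch.nn.functional as F

    def next_token_loss(model, token_ids):
        # token_ids: tensor of shape (batch, seq_len)
        inputs = token_ids[:, :-1]    # predict from everything but the last token
        targets = token_ids[:, 1:]    # the "labels" are the same sequence shifted by one
        logits = model(inputs)        # assumed shape: (batch, seq_len - 1, vocab_size)
        return F.cross_entropy(
            logits.reshape(-1, logits.size(-1)),
            targets.reshape(-1),
        )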
But, even the person who had that definition has "moved the goalposts" of his own definition. From recent interviews, Hassabis has moved towards a definition that resembles the one from this paper linked here. So there's that. We are all moving the goalposts.
And it's not a recent thing. People did this back in the 80s. There's the famous "As soon as AI does something, it ceases to be AI" or paraphrased "AI is everything that hasn't been done yet".
What counts as a "thing"? Because arguably some of the deep ANNs pre-transfomers would also qualify as AGI but no one would consider them intelligent (not in the human or animal sense of intelligence).
And you probably don't even need fancy neural networks. Get an RL algorithm and a properly mapped solution space and it will learn to do whatever you want, as long as the problem can be mapped.
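To make that concrete, here is a minimal tabular Q-learning sketch with no neural network at all; the env interface (reset()/step() returning a simplified 3-tuple, plus an n_actions attribute) is an assumption for the example, not a real library API:

    import random
    from collections import defaultdict

    def q_learning(env, episodes=1000, alpha=0.1, gamma=0.99, epsilon=0.1):
        # Q maps (state, action) -> estimated return; states and actions must be hashable,
        # i.e. the problem has to be "properly mapped" into a discrete space first.
        Q = defaultdict(float)
        actions = list(range(env.n_actions))  # assumed attribute on the toy env
        for _ in range(episodes):
            state, done = env.reset(), False
            while not done:
                if random.random() < epsilon:
                    action = random.choice(actions)                      # explore
                else:
                    action = max(actions, key=lambda a: Q[(state, a)])   # exploit
                next_state, reward, done = env.step(action)              # simplified 3-tuple
                best_next = 0.0 if done else max(Q[(next_state, a)] for a in actions)
                # standard Q-learning update
                Q[(state, action)] += alpha * (reward + gamma * best_next - Q[(state, action)])
                state = next_state
        return Q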
----
In 2010, one of the first "presentations" given at DeepMind by Hassabis had a few slides on AGI (from the movie/documentary "The Thinking Game"):
Quote from Shane Legg: "Our mission was to build an AGI - an artificial general intelligence, and so that means that we need a system which is general - it doesn't learn to do one specific thing. That's really key part of human intelligence, learn to do many many things".
Quote from Hassabis: "So, what is our mission? We summarise it as <Build the world's first general learning machine>. So we always stress the word general and learning here the key things."
And the key slide (that I think cements the difference between what AGI stood for then, vs. now):
AI - one task vs. AGI - many tasks, at human-level intelligence.
For reference, the average chess.com player is ~900 Elo, while the average FIDE-rated player is ~1600. So, yeah. Parrot or not, the LLMs can make moves above the average player. Whatever that means.
At first, just playing chess was considered to be a sign of intelligence. Of course, that was wrong, but not obvious at all in 1950.
When I was in college ~25 years ago, I took a class on the philosophy of AI. People had come up with a lot of weird ideas about AI, but there was one almost universal conclusion: that the Turing test is not a good test for intelligence.
The least weird objection was that the premise of the Turing test is unscientific. It sees "this system is intelligent" as a logical statement and seeks to prove or disprove it in an abstract model. But if you perform an experiment to determine if a real-world system is intelligent, the right conclusion for the system passing the test is that the system may be intelligent, but a different experiment might show that it's not.
> we ground our methodology in Cattell-Horn-Carroll theory, the most empirically validated model of human cognition.
You could easily write the reverse of this paper that questions whether human beings have general intelligence by listing all the things that LLMs can do, which human beings can't -- for example producing a reasonably accurate summary of a paper in a few seconds or speaking hundreds of different languages with reasonable fluency.
You can always cherry-pick stuff that humans are capable of that LLMs are not capable of, and vice versa, and I don't think there is any reason to privilege certain capabilities over others.
I personally do not believe that "General Intelligence" exists as a quantifiable feature of reality, whether in humans or machines. It's phlogiston, it's the luminiferous ether. It's a dead metaphor.
I think what is more interesting is focusing on _specific capabilities_ that are lacking and how to solve each of them. I don't think it's at all _cheating_ to supplement LLMs with tool use, RAG, or the ability to run Python code. If intelligence can be said to exist at all, it exists as part of a system, and even human intelligence is not entirely located in the brain, but is distributed throughout the body. Even a lot of what people generally think of as intelligence -- the ability to reason and solve logic and math problems -- typically requires people to _write stuff down_, i.e., use external tools and work through a process mechanically.
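As a sketch of what "intelligence as a system" can look like in practice, here is a minimal, hypothetical tool-use loop; llm() is a stand-in for any chat-model call, not a real library API, and the two tools are toy placeholders:

    import json

    TOOLS = {
        "python": lambda code: str(eval(code)),                 # toy "run python" tool (expressions only)
        "search": lambda query: f"top results for {query!r}",   # stand-in for retrieval / RAG
    }

    def run_agent(llm, question, max_steps=5):
        transcript = [{"role": "user", "content": question}]
        for _ in range(max_steps):
            reply = llm(transcript)               # assumed to return a dict
            if reply.get("tool"):                 # the model asked to use a tool
                result = TOOLS[reply["tool"]](reply["args"])
                transcript.append({"role": "tool", "content": json.dumps(result)})
            else:
                return reply["content"]           # final answer, written with tool results in context
        return "step limit reached"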
A team of humans can and will make a GPT-6. Can a team of GPT-5 agents make GPT-6 all on its own if you give it the resources necessary to do so?