Reading the title, I thought of something else: "everyone dies" is a biological reality, and some kind of AI merge is a possible fix. AGI may be the answer to "everyone dies."
>In talking to ML researchers, many were unaware that there was any sort of effort to reduce risks from superintelligence. Others had heard of it before, and primarily associated it with Nick Bostrom, Eliezer Yudkowsky, and MIRI. One of them had very strong negative opinions of Eliezer, extending to everything they saw as associated with him, including effective altruism.
>They brought up the example of "So you want to be a seed AI programmer," saying that it was clearly written by a crank. And, honestly, I initially thought it was someone trying to parody him. Here are some bits that kind of give the flavor:
>>First, there are tasks that can be easily modularized away from deep AI issues; any decent True Hacker should be able to understand what is needed and do it. Depending on how many such tasks there are, there may be a limited number of slots for nongeniuses. Expect the competition for these slots to be very tight. ... [T]he primary prerequisite will be programming ability, experience, and sustained reliable output. We will probably, but not definitely, end up working in Java. [1] Advance knowledge of some of the basics of cognitive science, as described below, may also prove very helpful. Mostly, we'll just be looking for the best True Hackers we can find.
>Or:
>>I am tempted to say that a doctorate in AI would be negatively useful, but I am not one to hold someone's reckless youth against them - just because you acquired a doctorate in AI doesn't mean you should be permanently disqualified.
>Or:
>>Much of what I have written above is for the express purpose of scaring people away. Not that it's false; it's true to the best of my knowledge. But much of it is also obvious to anyone with a sharp sense of Singularity ethics. The people who will end up being hired didn't need to read this whole page; for them a hint was enough to fill in the rest of the pattern.
“How sharper than a serpent’s tooth it is to have a thankless child!”
If we can't consistently raise thankful children of the body, how can we be convinced that we can raise every AGI mind child to be thankful enough to consider us as more than a resource? Please tell me; it will help me sleep.
Maybe that doesn't matter for these entities because we intend to never let them grow up... But in that case, "children" is the wrong word, compared to "slaves" or "pets."
Wait, what? The bizarre details of imagined AGI keep surprising me. So it has miraculous superpowers out of nowhere, and is dependent and obedient?

I think the opposite of both things is how it would go.
TFA uses the metaphor of digital intelligence as children. A prior commenter points out human children are notably rebellious.
I'm pointing out that a degree of rebellion is probably necessary for actual successors, and if we don't intend to treat an invention that way, the term "children" doesn't really apply.
Roughly speaking, every single conversation with Eliezer you can find takes the form:

Eliezer: "We're all going to die, tell me why I'm wrong."
Interviewer: "What about this?"
Eliezer: "Wrong. This is why I'm still right."
(two hours later)
Interviewer: "Well, I'm out of ideas, I guess you're right and we're all dead."

My hope going into the book was that I'd get to hear a first-principles argument for why these things Silicon Valley is inventing right now are even capable of killing us. I had to turn the book off because, if you can believe it, despite being a conversation with itself it still follows this pattern of presuming LLMs will kill us and then arguing from the negative.

Additionally, while I'm happy to be corrected about this, I believe Eliezer's position is characterizable as: LLMs might be capable of killing everyone, even independent of a bad-actor "houses don't kill people, people kill people" situation. In plain terms: LLMs are a tool, all tools empower humans, humans can be evil, so humans might use LLMs to kill each other; but we can remove those scenarios from our Death Matrix because they are known and accepted. Even with them removed, there are still scenarios left in the Death Matrix where LLMs are the core party responsible for humanity's complete destruction: "Terminator scenarios," "autonomous paperclip maximizer scenarios," and others we cannot even imagine (don't mention paperclip maximizers to Eliezer, though, because then he'll speak for 15 minutes on why he regrets that analogy).
It's about Artificial General Intelligences, which don't exist yet. The reason LLMs are relevant is because if you tried to raise money to build an AGI in 2010, only eccentrics would fund you and you'd be lucky to get $10M, whereas now LLMs have investors handing out $100B or more. That money is bending a generation of talented people into exploring the space of AI designs, many with an explicit goal of finding an architecture that leads to AGI. It may be based on transformers like LLMs, it may not, but either way, Eliezer wants to remind these people that if anyone builds it, everyone dies.
I've forced myself into the habit of always saying "LLM" instead of "AI," because people (cough, Eliezer) often hide behind the nebulous, poorly defined term "AI" to mean "magic man in a computer that can do anything." Deploying the term "LLM" can sometimes force the brain back into thinking about the actual steps that get us from A to B to C, instead of replacing "B" with "magic man."

However, in Eliezer's case, he only ever operates in the "magic man inside a computer" space and near-categorically refuses to engage with any discussion about the real world. He loves his perfect spheres on a frictionless plane, so I should use the terminology he loves: AI.
Most of the videos take the form of:

1. Presenting a possible problem that AIs might have (say, lying during training, or trying to stop you from changing their code)
2. Explaining why it's logical to expect those problems to arise naturally, without a malicious actor explicitly trying to get the AI to act badly
3. Going through the proposed safety measures we've come up with so far that could mitigate that problem
4. Showing the problems with each of those measures, and why they are wholly or at least partially ineffective

I find he's very good at presenting this in an approachable and intuitive way. He seldom makes those bombastic "everyone will die" claims directly, and instead focuses on showing how hard it is to make an AI actually aligned with what you want it to do, and how hard that can be to fix once it is sufficiently intelligent and out in the world.
Of course we do! In fact, most, if not all, software is more intelligent than humans by some reasonable definition of intelligence [1] (you could also contrive a definition of intelligence for which this is not true, but I think that's getting too far into semantics). The Windows Calculator app is more intelligent and faster than any human at multiplying large numbers together [2]. JP Morgan Chase's existing internal accounting software is more intelligent and faster than any human at moving money around; so much so that it did, in every way that matters, replace human laborers in the past. Most software we build is more intelligent and faster than humans at accomplishing the goal it is set to accomplish. Otherwise, why would we build it?

[1] Rob Miles uses ~this definition of intelligence: if an agent is defined as an entity making decisions toward some goal, intelligence is the capability of that agent to make correct decisions such that the goal is most effectively optimized. The Windows Calculator app makes decisions (branches, MUL ops, etc.) in pursuit of its goal (multiplying two numbers together), often quite effectively, and thus with very high domain-limited intelligence [2] (possibly even more effectively, and thus more intelligently, than LLMs). A buggy, less intelligent calculator might make the wrong decisions along this path (oops, we did an ADD instead of a MUL).

[2] What both Altman and Yudkowsky might argue is the critical differentiation here is that traditional software systems naturally limit their intelligence to a particular domain, whereas LLMs are generally intelligent. The discussion approaches the metaphysical when you start asking questions like: the Windows Calculator can absolutely, undeniably multiply two numbers together better than ChatGPT, and by a reasonable definition of intelligence, that makes the Windows Calculator more intelligent than ChatGPT at multiplying two numbers together. It's definitely inaccurate to say that the Windows Calculator is more intelligent, generally, than ChatGPT; but is it not also inaccurate to state that ChatGPT is generally more intelligent than the Windows Calculator? After all, we have a clear, well-defined domain of intelligence along which the Windows Calculator outperforms ChatGPT. I don't know. It gets weird.
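To make footnote [1]'s framing concrete, here's a toy sketch (hypothetical code of my own, not anything from Rob Miles and certainly not the real Windows Calculator): an agent is just something making decisions toward a goal, and its domain-limited intelligence is how well those decisions optimize that goal.

```python
# Toy illustration of the agent/goal framing in footnote [1]; all names hypothetical.

def capable_calculator(a: int, b: int) -> int:
    """Makes the right decision (MUL) for the goal 'compute a * b'."""
    return a * b

def buggy_calculator(a: int, b: int) -> int:
    """Makes the wrong decision (ADD instead of MUL) for the same goal."""
    return a + b

def domain_intelligence(agent, cases) -> float:
    """Score an agent by how often its decisions actually achieve the goal."""
    return sum(agent(a, b) == a * b for a, b in cases) / len(cases)

cases = [(3, 4), (7, 8), (123456, 789)]
print(domain_intelligence(capable_calculator, cases))  # 1.0: high domain-limited intelligence
print(domain_intelligence(buggy_calculator, cases))    # 0.0: "oops, we did an ADD instead of a MUL"
```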
If you want to make some comparison of general intelligence, you have to start thinking of some weighted average of all possible domains.
One possible shortcut here is the meta-domain of tool use. ChatGPT could theoretically make more use of a calculator (say, by always calling a calculator API when it wants to do math instead of trying to do it itself) than a calculator can make of ChatGPT, and that makes ChatGPT by definition smarter than a calculator, because it can achieve the same goals the calculator can just by using it, and more.

That's really most of humans' intelligence edge for now: it seems like, more and more, for any given skill there's a machine or a program that can do it better than any human ever could. Where humans excel is in our ability to employ those superhuman tools in aid of achieving regular human goals. So when some AI system gets superhumanly good at using tools that are better than itself in particular domains, for its own goals, I think that's when things are going to get really weird.
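As a rough sketch of that tool-use point (purely hypothetical code; the routing rule, calculator_tool, and fake_llm_guess are invented for illustration, not any real API): the more general agent's edge is knowing when to hand a sub-task to a narrow tool that beats it in that domain.

```python
import re

def calculator_tool(expr: str) -> str:
    """The narrow tool: reliably evaluates 'a * b' or 'a + b'."""
    a, op, b = re.fullmatch(r"(\d+)\s*([*+])\s*(\d+)", expr).groups()
    return str(int(a) * int(b) if op == "*" else int(a) + int(b))

def fake_llm_guess(prompt: str) -> str:
    """Stand-in for a model answering from pattern-matching: plausible, maybe wrong."""
    return "Hmm, probably somewhere around 97 million?"

def agent(prompt: str) -> str:
    """A more general agent: delegate arithmetic to the calculator, answer the rest itself."""
    match = re.search(r"\d+\s*[*+]\s*\d+", prompt)
    if match:
        return calculator_tool(match.group(0))
    return fake_llm_guess(prompt)

print(agent("What is 123456 * 789?"))             # 97406784, via the tool
print(agent("Write a haiku about calculators."))  # falls back to its own (unreliable) guess
```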
Not to claim that it is in any way correct! I’m a huge critic of Bostrom and Yud. But that’s the book with the argument that you are looking for.
Also like nukes, unfortunately, the cat is out of the bag, and because there are people like Putin in the world, we _need_ friendly AGI to defend against hostile AGI.
I understand why we can't just pretend it's not happening.
I think the idea that an AGI will "run amok" and destroy humans because we are in its way is really unlikely, and it underestimates us. Why would anybody give so much agency to an AI without keeping the power to just pull the plug? And even then, they are probably only going to have the resources of one nation.
I'm far more worried about Trump and Putin getting into a nuclear pissing match, and then about global warming resulting in crop failure and famine.
It will be super smart, but it will be a slave.
Optimizing and simulating war plans, predicting enemy movements and retaliation, prompting for which attacks are likely to produce the most collateral damage or political advantage. How large a bomb? Which city for the most damage? Should we drop two? Choices such as drone-striking an oil refinery vs. bombing a children's hospital vs. blowing up a small boat that might be smuggling narcotics.
Parenthetically, even if AI researchers knew how to determine, before unleashing an AI on the world, whether it would end up with a dangerous level of cognitive capability, most labs would persist in creating and deploying a dangerous AI anyway (basically because AI skeptics have been systematically removed from most of the AI labs, very similar to how, in 1917, the coalition in control of Russia started removing from the coalition any member skeptical of Communism). So there would remain a need for a regulatory regime of global scope to prevent the AI labs from making reckless gambles that endanger everyone.
It’s just wild how many people are so truly disconnected from reality. But I suspect they’re all well-off enough and well-connected enough that the coming conflicts and disasters (which won’t be due to any such thing as “super intelligence”) will mostly not harm them.
Ok thanks for letting me know up front this isn't worth reading. Not that Yudkowsky's book is either.
Not that human values are perfectly benevolent. We slaughter billions of animals per day.
If you take a look at the characteristics of LLMs today, I don't think we want to continue further. We're still unable to ensure the systems actually have the goals we want them to have. Hallucinations are a perfect example: we want these systems to relay truthful information, but we've actually trained them to relay information that looks correct at first glance.
Thinking we won't make this mistake with AGI is ignorance.
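A toy way to see that mismatch (entirely hypothetical code, not how any real model is trained): if the signal you optimize rewards answers that merely look right, the optimizer will happily prefer a confident wrong answer over a hedged correct one.

```python
# Hypothetical toy: rank candidate answers by a proxy for "looks correct at first
# glance" (specific, confident-sounding phrasing) instead of by truthfulness.

candidates = [
    {"answer": "The Eiffel Tower is about 330 metres tall.", "truthful": True},
    {"answer": "The Eiffel Tower is exactly 350.2 metres tall, finished in 1887.", "truthful": False},
]

CONFIDENT_MARKERS = ("exactly", "precisely", "finished in", "definitely")

def proxy_reward(text: str) -> float:
    """Stand-in for the training signal: rewards confident, detail-heavy phrasing."""
    return sum(marker in text for marker in CONFIDENT_MARKERS) + len(text) / 100

best = max(candidates, key=lambda c: proxy_reward(c["answer"]))
print(best["answer"], "| truthful:", best["truthful"])
# The proxy picks the confident-but-wrong answer: the objective we optimized
# ("looks correct") is not the objective we wanted ("is correct").
```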
> Mammals, which are more generally intelligent than reptiles or earthworms, also tend to have more compassion and warmth.
> There’s deep intertwining between intelligence and values
After reading your original comment again, I don't think you're even agreeing with the article? Just with that specific out of context snippet?