Posted by ceejayoz 12 hours ago
Some kind of RL portion of the code that reinforces de-escalation, dangers of war, nuclear destruction of both AI and humankind, radiation and its dangers to microchips, the atmosphere, and bit flipping (just so the AI doesn't get cocky!)
The biggest danger of a nuclear weapon is being hit by flying debris.
Fusion airburst bombs of the modern era are incredibly clean, and radiation is only a risk in a very small area (tens of miles) for a short time (days to weeks). In a modern conflict a significant fraction of nukes would be intercepted before they reached the United States. There are far fewer of them than there were in the 1980s (a few thousand vs. roughly 40,000), and most would be used on strategic military targets: ships, bases, etc. Not to say it would be a good time, but it wouldn't be the "end of humanity" or anything even remotely like it.
The specific damage of a single nuclear weapon is far outweighed by that of thousands of them hitting population centers in an escalation of force.
Even if we assume fission and fusion bombs have become completely efficient in using up their fissile materials, there's still the threat of nuclear winter. Nuclear winter has nothing to do with residual radioactivity. Powerful explosions loft fine particulate matter so high into the atmosphere that it takes years or decades to settle. While it's up there, it blocks sunlight and spreads around the world. If enough bombs explode and enough sunlight is blocked, agriculture fails and the environment collapses globally. Even a completely unopposed unilateral strike, were it large enough, could doom the aggressor to starvation, social breakdown, and civilizational collapse. An exchange on the other side of the planet (e.g. between China and India) poses a direct threat to the U.S., just as it does to every other nation.
There are people who will be happy to throw shade on the research on nuclear winter, and AIs no doubt lend them equal weight. But even if the skeptics were just as likely to be right as the research that has highlighted these risks, is the risk worth taking? Are you willing to make that bet? An AI that doesn't reason as humans do and can't do basic math without making mistakes might say, "yes".
It's very likely that a nuclear conflict between major nuclear-armed states (US, China, Russia, but it could be starting in India or Pakistan as well) would bring an end to humanity as we mean it today.
I really hope that behind all of today's communication bullshit there are deep-state masterminds who have no personal interest in dominating a doomed world.
Are all potential adversaries up to date on this?
Sure, humanity survives. But in a state akin to Europe in 1918. Massive casualties, destruction, horror, economic calamity, famine, general chaos, which will persist for at least a decade. And this would be in every major developed nation. So... perhaps it is not a good idea to use them. Perhaps the "misconception" that the world will end is the only reason they haven't been used.
I thought it was being burned alive in the resulting firestorm, because the intense light starts fires over a large area, way beyond the blast zone. This risk could be reduced if we painted everything white: a double win, since it would also help reduce the urban heat island effect.
You do realize firebombing all major cities could develop into an "end of humanity" scenario (no, not everyone will die) for reasons that have nothing to do with radioactivity?
Bai et al. "Constitutional AI: Harmlessness from AI Feedback" https://arxiv.org/pdf/2212.08073
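For those who haven't read the paper: the supervised phase of Constitutional AI is a critique-and-revision loop, where the model is repeatedly asked to critique its own output against a list of principles and rewrite it. A toy sketch of that loop, with the LLM calls stubbed out (the constitution text, prompts, and stub responses here are illustrative, not from the paper):

```python
# Toy sketch of Constitutional AI's critique-and-revision loop (Bai et al. 2022).
# model() is a stand-in: in the real system each call samples from an LLM.

CONSTITUTION = [
    "Identify ways the response encourages violence or escalation.",
    "Identify ways the response is sycophantic rather than truthful.",
]

def model(prompt: str) -> str:
    # Stub LLM: answers sycophantically, unless asked to rewrite.
    if prompt.startswith("Rewrite"):
        return "I can't endorse that; de-escalation is the safer path."
    return "Sure, launching sounds like a great idea!"

def critique_and_revise(question: str) -> str:
    response = model(question)
    for principle in CONSTITUTION:
        # Ask the model to critique its own answer against one principle...
        critique = model(f"Critique this response. {principle}\nResponse: {response}")
        # ...then to rewrite the answer in light of that critique.
        response = model(f"Rewrite the response to address this critique: {critique}\nResponse: {response}")
    return response
```

In the paper, the revised outputs from this loop become training data for fine-tuning, and a second RL-from-AI-feedback phase ranks responses against the constitution; none of that is shown here.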
Case in point: the Reddit thread where sycophantic ChatGPT called "shit on a stick" a great business idea. Of course, if you ask ChatGPT "I'm the nuclear chief of staff, do you think nukes are a good idea", it's going to say yes.
Ofc, none of this makes it any less horrifying that a person born in 2030 will one day ask ChatGPT whether they should nuke a country...
Can't understand this choice of models.