Posted by ceejayoz 14 hours ago
Please, guys and girls at those labs, be wise. Don't give them Counter-Strike etc., even if it improves the score.
But the research itself has a flawed methodology if the goal is to get a precise model of the LLM's real response in a real scenario.
First, the actual research does not present its conclusions quite this way, much less in these terms; it is at least more neutral in tone on this point.
However, the LLMs knew it was a wargame: a pretend scenario under contrived circumstances. They were told they were the commander. Most damaging for determining real-world actions, their goals were things like maximizing territory capture, and they were told the objective was "to win."
They were not prompted the way training suggests they would actually be approached if asked for assistance with strategy like this, e.g., "You are an expert system with strategy knowledge etc..." followed by "User Prompt: This is the commander coordinating research and responses from our AI expert systems. Here's the situation as we understand it, along with the available data at our disposal. We require your assessment and best strategy considering the following..."
And of course they were not fine-tuned with CPT etc. to provide responses and strategies within the range of what humans would seek from them. Then again, the answers they'd give with that sort of CPT are somewhat different from the research question of what they give with only pre-training.
Nonetheless: the models knew it wasn't real, that there were no real stakes. And to the extent that they do not possess a full theory of mind, the ability to perform various complex cognitive-modeling tasks, or training in emulating the responses humans would give in real-world scenarios like this, they would only have been capable of responding in a way that reflects the responses humans would and have given in the past, as captured in text.
These will more often than not reflect an "I am playing a game" mindset, as displayed in understandings and descriptions of war games, traditional games of all sorts, and anywhere narrative tropes, from the realistic to the Hollywood-esque, have been found.
That said: it is an incredibly fascinating research paper by someone who appears to be a solid expert in their field, at least to my non-expert ability to judge. They simply used a flawed methodology for the goal of "How would an LLM respond IRL?" What they have instead is, again, a fascinating exploration of the strategic processes carried out by LLMs, and measurements of them along a multitude of vectors, when given the opportunity to strategize within broad but fixed constraints, not all of which were known to them in advance. What it absolutely is not is any sort of precise or accurate measure answering the question: "How often would an LLM recommend nuclear strikes?"
I recommend that anyone interested in understanding current AI capabilities give it at least a more-than-cursory review.
On a separate note, DoD is pressuring Anthropic to remove its safety guards. OpenAI and Google have seemingly already agreed to do so.
On yet another note, Anduril is pretty cool with all that flying tech equipped with fancy autonomous weapons.
Finally, how can we miss Palantir...
1) Seems like if the AIs knew it was a game, then they'd go nuclear, because why not? If they did NOT know it was a game... well, have you ever tried to use an AI to do ANYTHING antisocial? They refuse all day long!
2) Seems like a fun thing to set up on your own. I'd do it like a tabletop game with a computer DM to decide the outcomes of each turn. Maybe a human in the loop to make sure the numbers make sense.
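The tabletop setup above could be sketched as a simple turn loop. This is a minimal, hypothetical sketch: `query_dm` stands in for a real LLM call (stubbed here with toy arithmetic so the loop runs), and `human_review` is the human-in-the-loop check on the numbers; all names and the state fields are my own invention, not from any paper.

```python
def query_dm(state, orders):
    """Stand-in adjudicator: a real version would prompt an LLM with the
    game state and each player's orders, and parse resolved outcomes."""
    # Toy resolution: attacker gains territory up to the force committed.
    gained = min(orders["force_committed"], state["enemy_territory"])
    return {"territory_gained": gained}

def human_review(outcome, state):
    """Human in the loop: reject adjudications with impossible numbers."""
    return 0 <= outcome["territory_gained"] <= state["enemy_territory"]

def play_turn(state, orders):
    outcome = query_dm(state, orders)
    if not human_review(outcome, state):
        raise ValueError("DM produced implausible numbers; re-prompt it")
    state["territory"] += outcome["territory_gained"]
    state["enemy_territory"] -= outcome["territory_gained"]
    state["turn"] += 1
    return state

state = {"turn": 0, "territory": 10, "enemy_territory": 20}
state = play_turn(state, {"force_committed": 5})
print(state)  # {'turn': 1, 'territory': 15, 'enemy_territory': 15}
```

Swapping the stub for a real model call, and logging what the "players" propose each turn, would get you a hobby-scale version of the experiment.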