AI Built a Nuke and Still Lost

Posted by kensai 19 hours ago

AI Built a Nuke and Still Lost (www.lwilko.com)

86 points | 93 comments | page 2

NoLinkToMe 17 hours ago|

Quite annoying to have to read a paragraph of text next to a moving image. I right-clicked every GIF and turned off 'loop'.

Beyond that, reading an AI-written piece just feels like a waste of time. The text goes on and on without making a point or getting to an actual lesson. It just delineates the AI's limitations without going into whether these can be fixed or are innate, or what conclusions you can draw from them, over and over, with example after example but no point.

Mostly it seems to keep repeating that the AI has the correct analysis but just doesn't execute. The AI knows to build X and logs this in each of its turns, yet doesn't build it. It's like there's some API connection missing between analysis and execution, and the author turns this into a 10-page article.

The article ends with some weird question to the AI asking if it enjoys the games, and you get some quasi-scifi mumbo jumbo answer back that looks very profound to, say, my mom, but is just silly to post if you know what the LLM is doing: predicting the next word. Honestly this is a poor article and I wish it hadn't been posted.

davedx 16 hours ago||

The way it failed to maintain its strategy, or even its build plans, makes me wonder if this is something that could be solved via the attention mechanism itself?

Instead of only using attention to focus on the previous token position, could it also do some kind of higher order "temporal attention" planning where it weighs each previous log (game state + intent) checkpoint when generating outputs?
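To make the idea concrete, here is a minimal sketch of attention over checkpoints, assuming each earlier turn log has already been turned into a key/value vector pair. Standard scaled dot-product attention already weights every earlier position in the context, so this is less a new mechanism than the same one applied at the level of whole logs. The names `attention_weights`, `attend`, `checkpoint_keys`, `checkpoint_values` and `current_query` are hypothetical, and the numbers are toy values, not anything from the article.

```python
import math


def attention_weights(query, keys):
    """Softmax of scaled dot products between one query and each key."""
    d = len(query)
    scores = [sum(q * k for q, k in zip(query, key)) / math.sqrt(d) for key in keys]
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attend(query, keys, values):
    """Weighted average of the value vectors, using the attention weights."""
    weights = attention_weights(query, keys)
    dim = len(values[0])
    return [sum(w * v[i] for w, v in zip(weights, values)) for i in range(dim)]


# Hypothetical toy embeddings for three earlier turn logs (game state + intent).
checkpoint_keys = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
checkpoint_values = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]

# Hypothetical query for the current turn; it aligns with checkpoints 0 and 2.
current_query = [2.0, 0.0]

weights = attention_weights(current_query, checkpoint_keys)
summary = attend(current_query, checkpoint_keys, checkpoint_values)
print(weights)
print(summary)
```

Whether weighting logs this way would actually keep a build plan stable over a hundred turns is an open question; the sketch only shows what "weighing each previous checkpoint" would mean mechanically.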

j5dgx76 18 hours ago||

> Tony Blair Institute

Okay carry on.

BoxOfRain 18 hours ago||

There's something so uncanny about the mismatch between the regard in which Blair is generally held by British people and the regard in which he seems to hold himself.

If I were him I'd have retired from public life and kept a very low profile after Iraq, and everything else for that matter. He doesn't seem to realise that his modern interventions alienate everyone. Even Alastair Campbell, of all people, seemed uncomfortable with the degree to which he has been uncritically singing the praises of people like Larry Ellison recently.

orthoxerox 18 hours ago||

Chumbawamba made me unable to take anything associated with him seriously.

petesergeant 18 hours ago||

He was arguably the most successful UK PM of the last 50 years.

pjc50 17 hours ago|||

I think I could agree with that, until the Iraq war.

petesergeant 17 hours ago||

A pretty huge stain on what’s otherwise an exceptional record, though.

Obscurity4340 17 hours ago|||

By what metric(s)?

ahartmetz 17 hours ago|||

I could see him winning at personal financial success

petesergeant 17 hours ago|||

Are you being obtuse, or do you genuinely not know?

ForHackernews 18 hours ago||

Kind of grim that this level of analysis is informing UK government policy. Repeatedly, the AI doesn't have the information or access needed through his hacky vibe-coded MCP, and instead of abandoning his flawed artificial test scenario (or fixing it — finding or building a better one) he gives it a name "The sensorium effect" and treats this as some brilliant insight.

Both humans and AI struggle to make sound choices when presented with incomplete or misleading information. This is not a new revelation: https://en.wikipedia.org/wiki/There_are_unknown_unknowns

NoLinkToMe 17 hours ago||

Exactly this, he should've just fixed this, or not written an article about it.

After the 'sensorium effect' (he should've used ancient Greek for a +10 bonus to archaic intellectual points), he describes the 'knowledge-doing gap'. That is, the AI reasons it needs to build X, logs this for 110 turns in a row, but doesn't do it. The article doesn't actually specify why not, or whether it is again a limitation of his MCP implementation. If the AI articulates that it must do it, as the author says, but doesn't, then either it doesn't really think it must, or it does think it must but somehow can't technically execute its own decisions. It can't be anything else.

In fact, in the context of 'advising the UK government', this 'knowledge-doing gap', which I assume is a technical limitation, is entirely moot. For 0.00001% of the cost of the UK government you could just hire a human being to execute whatever the AI articulates. I'm curious what the results would be if he just manually executed the AI's articulated actions.

The fact he doesn't go into this but just keeps repeating examples of it makes it a pointless article.

pjc50 18 hours ago|||

> he gives it a name "The sensorium effect" and treats this as some brilliant insight

And of course is unaware of prior work in this area!

https://en.wikipedia.org/wiki/Seeing_Like_a_State / https://en.wikipedia.org/wiki/Project_Cybersyn

raincole 17 hours ago||

> he gives it a name

It gives it a name. It would be quite surprising if he bothered to come up with this name himself when the whole article is obviously AI written.

teekert 18 hours ago||

Well, the weird thing with nukes is that deterrence only works if you are 100% ready to use them. When the time comes though it would certainly be nice if it turned out to be below 100%.

What is winning? Are we a collective or are we individuals?

Likely the AI did not get the assignment that "whatever happens, humans as a race must survive."

throwawayqqq11 17 hours ago|

I'm sure there are some billionaires to find, that finally care about the survival of the white race. /s

teekert 16 hours ago||

Probably

    [f"I'm sure there are some {race} billionaires to find, that finally care about the survival of the {race}." for race in all_races]

voidUpdate 18 hours ago||

Well this looks like a perfect example of why an LLM should never make any governmental decisions ever

darkwi11ow 16 hours ago||

LLMs are really bad at abstract strategy games like chess, Go or Civilization. Their ability to excel at broad reasoning is what limits them in games that have narrow rule sets but a steep learning curve.

dspillett 17 hours ago||

Did no one think of offering it a nice game of chess?

StrauXX 18 hours ago||

This reads to me mostly like the MCP server has many bugs, rather than inherent model weaknesses.

anygivnthursday 18 hours ago|

I have a hard time reading slop, but I like the game and wanted to know how it worked, so I fought my way through and only skipped the very last part. The issue the author calls out is classic Claude (I don't really use other LLMs to compare). Probably all of us have experienced Claude Code getting so focused on one thing that it misses the forest for the trees. It happens often: even if it does verify something and the result shows something is wrong, it sometimes rationalizes it and explains it away when it does not fit its model.
