Feds freaked over Fable 5 after 'fix this code', not jailbreak, say researchers

Posted by _tk_ 5 days ago

(www.theregister.com)

610 points | 360 comments | page 6

smrtinsert 4 days ago

I can only imagine the unintended consequence of this whole fiasco will be for frontier providers to stop providing future "warnings" about model capabilities in order to de-risk earnings.

AndrewKemendo 4 days ago

I’m still not buying that this was an actual USG order. The only people commenting are “experts” and there has been no official announcement from the USG.

This doesn’t smell like an NSL, and there’s no process to selectively “export control” something like this.

Even so, there are a dozen mechanisms through the courts to challenge this, and Anthropic isn’t using any of them.

I think this is a made-up crisis for PR with no actual legal requirements behind it.

> On Friday, the US government, reportedly citing national security concerns, issued an export control directive to suspend access to Fable 5 and Mythos 5 by any foreign national, inside or outside the United States. In response, Anthropic disabled both models “for all our customers to ensure compliance.”

smallerize 4 days ago

David Sacks is on the record confirming it. https://www.tomshardware.com/tech-industry/artificial-intell...

phendrenad2 4 days ago

So, they gave Fable a codebase full of exploits and said "fix this code", and it fixed the code?

Sounds like they freaked out because Fable is too good at finding NSA backdoors?

tiborsaas 4 days ago

What if everybody on the internet starts running "fix this code"?

https://xkcd.com/810/

htrp 4 days ago

If "fix this code" gets by the guardrails, they are effectively using rule-based classifiers (or an LLM-as-a-judge on the prompt).

cwoolfe 4 days ago

Cyber defense and offense are the same security research skillset. Not sure anybody could really untangle that.

catigula 4 days ago

>“The behavior described in the paper cannot meaningfully be fixed, and any attempt would only weaken the model for defense,” said Moussouris, who criticized the export control directive as hasty, heavy-handed, and misguided.

This literally means the models are too dangerous to release, and yet he and they reached the opposite conclusion.

A lot of people have been saying this repeatedly for a long time.

switchbak 4 days ago

Or perhaps: we don't want our adversaries fixing all the security holes we rely on.

Or even: this is a good chance to stick it back to Anthropic.

ceejayoz 4 days ago

> This literally means the models are too dangerous to release…

Unless you believe Anthropic has an irreplaceable wizard or genie or fairy chained up somewhere that other providers can't replicate, someone is going to release such a thing, and that someone might be a lot more cavalier about the safety of it.

catigula 4 days ago

Yes, this is the flawed logic Anthropic is using to do dangerous things; it's not lost on anyone.

ceejayoz 4 days ago

What's flawed about the logic?

Are we gonna drone strike China's datacenters when they release a similar model?

kylemaxwell 4 days ago

Moussouris is not a "he".

cryptonector 4 days ago

I've had to convince ChatGPT that the code is mine before it would do a security review.

malyk 4 days ago

Yes, I ran into the same problem last week. But I just said "this is my code in a private repo" and then it just went and did what I asked without question.

davesque 4 days ago

Kind of highlights how ridiculous their notion of safety is in this case. By this measure, I guess making the model "safe" means making it play dumb and intentionally ignore security bugs that it notices in the code? And what will the eventual legality of this look like? "Yes, your honor, we allege that this AI system that was sold to us willingly and knowingly ignored a critical security vulnerability in our software system, thereby leading us to be hacked and causing our business to fold."

It's exactly the same problem as backdoors in crypto systems. Criminals will find the crypto that isn't broken and use it regardless (or make it for themselves), while the rest of us losers are stuck with the broken version that we're allowed to use.

On this issue of cyber security, it seems better if authorities just start acting like the cat is out of the bag instead of pretending it isn't. ASI is basically here now, so what are we going to do about it?

On another note, I doubt this was anything other than a vindictive administration enacting revenge on a party that refused them. We all know the Trump admin's priorities.

smasher164 4 days ago

Honestly, given how trivial it is for Mythos-class models to identify an exploit, I’m going to assume any sufficiently large project written in C, C++, or Zig is riddled with latent vulnerabilities and already compromised.
