Posted by be7a 10 hours ago

System Card: Claude Mythos Preview [pdf](www-cdn.anthropic.com)
Related: Project Glasswing: Securing critical software for the AI era - https://news.ycombinator.com/item?id=47679121

Assessing Claude Mythos Preview's cybersecurity capabilities - https://news.ycombinator.com/item?id=47679155

556 points | 407 comments | page 6
simianwords 10 hours ago
> We also saw scattered positive reports of resilience to wrong conclusions from subagents that would have caused problems with earlier models, but where the top-level Claude Mythos Preview (which is directing the subagents) successfully follows up with its subagents until it is justifiably confident in its overall results.

This is pretty cool! Does it happen at the moment?

jdthedisciple 8 hours ago
Opus 4.6 is already incredible so this leap is huge.

Although, amusingly, today Opus told me that `LIKE '%emerge%'` would not match the string 'emergency' in SQLite.

Moment of disappointment. Otherwise great.
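
For reference, the pattern does match. A minimal sketch, assuming an in-memory database and Python's standard `sqlite3` module; the helper name `like_matches` is hypothetical and only for illustration:

```python
import sqlite3


def like_matches(value: str, pattern: str) -> bool:
    """Return True if SQLite's LIKE operator matches `value` against `pattern`."""
    conn = sqlite3.connect(":memory:")  # throwaway in-memory database
    try:
        # Bind both operands as parameters; LIKE yields 1 (match) or 0 (no match).
        row = conn.execute("SELECT ? LIKE ?", (value, pattern)).fetchone()
        return bool(row[0])
    finally:
        conn.close()


# '%' matches any run of characters, so this is a substring test.
print(like_matches("emergency", "%emerge%"))   # True
print(like_matches("emergency", "%emerged%"))  # False
```

'emergency' contains 'emerge' as a substring, so `%emerge%` matches it.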

bornfreddy 8 hours ago
I only have 3 points against LLMs: they lack reason and they can't count.
FeepingCreature 8 hours ago
'emer ge' is two tokens, 'emergency' is one. The models think in a logosyllabic language.
atlgator 6 hours ago
[flagged]
dang 5 hours ago
We're getting complaints that you're posting generated comments to HN. That's not allowed here, so can you please not? See https://news.ycombinator.com/newsguidelines.html#generated and https://news.ycombinator.com/item?id=47340079

(If this is a wrong guess, I apologize - it's impossible to be sure)

sheeshkebab 5 hours ago
Again, wake me up when it can do laundry.
dwaltrip 3 hours ago
Time to wake up:

π*0.6: two and a half hours of unseen folding laundry (Physical Intelligence)

https://www.youtube.com/watch?v=ZpHapIlJnMo

throw310822 2 hours ago
Looks like the first two hours were spent trying to fold the same t-shirt :)
FergusArgyll 5 hours ago
"Deep learning is hitting a wall"