System Card: Claude Mythos Preview [pdf]

Posted by be7a 6 hours ago

System Card: Claude Mythos Preview [pdf] (www-cdn.anthropic.com)

Related: Project Glasswing: Securing critical software for the AI era - https://news.ycombinator.com/item?id=47679121

Assessing Claude Mythos Preview's cybersecurity capabilities - https://news.ycombinator.com/item?id=47679155

470 points | 333 comments | page 4

Stevvo 6 hours ago|

"Claude Mythos Preview’s large increase in capabilities has led us to decide not to make it generally available."

Disappointing that AGI will be for the powerful only. We are heading for an AI dystopia of Sci-Fi novels.

girvo 4 hours ago||

Not surprising though; this was always going to be the end result within our current systems, I think. When you add up scaling power and required cost, plus how talent concentrates in our economic systems, we were always going to end up with monopolies.

Unless governments nationalise the companies involved, but then there’s no way our governments of today give this power out to the masses either.

gom_jabbar 3 hours ago||

Expected outcome. Nick Land and the CCRU have explored how capitalism operationalizes science fiction (distilled in the concept of Hyperstition). Viewed through this lens, prices encode "distributed SF narratives." [0]

[0] Nick Land (1995). No Future in Fanged Noumena: Collected Writings 1987-2007, Urbanomic, p. 396.

awestroke 6 hours ago||

I predict they will release it as soon as Opus 4.6 is no longer in the lead. They can't afford to fall behind. And they won't be able to make a model that is intelligent in every way except cybersecurity, because that would decrease general coding and SWE ability

chippiewill 6 hours ago|

Alternatively they'll just wreck it down a bit so it beats a competitor but isn't unsafe.

juleiie 5 hours ago||

Honestly if that was some kind of research paper, it would be wholly insufficient to support any safety thesis.

They even admit:

"[...]our overall conclusion is that catastrophic risks remain low. This determination involves judgment calls. The model is demonstrating high levels of capability and saturates many of our most concrete, objectively-scored evaluations, leaving us with approaches that involve more fundamental uncertainty, such as examining trends in performance for acceleration (highly noisy and backward-looking) and collecting reports about model strengths and weaknesses from internal users (inherently subjective, and not necessarily reliable)."

Is this not just an admission of defeat?

After reading this paper I don't know if the model is safe or not, just some guesses, yet for some reason catastrophic risks remain low.

And this is for just an LLM after all, very big but with no persistent memory or continuous learning. Imagine an actual AI that improves itself every day from experience. It would be impossible to have the slightest clue about its safety, not even this nebulous statement we have here.

Any such future architecture would essentially be Russian roulette, with the number of bullets decided by initial alignment efforts.

LoganDark 6 hours ago||

> Claude Mythos Preview’s large increase in capabilities has led us to decide not to make it generally available.

Shame. Back to business as usual then.

Tepix 6 hours ago|

I for one applaud them for being cautious.

LoganDark 5 hours ago||

Being cautious is fine. Farming hype around something that may as well not exist for us should be discouraged. I do appreciate the research outputs.

Archit3ch 3 hours ago||

Don't worry, in 6-8 months the open models will catch up. Or I guess _do_ worry? ;)

FergusArgyll 2 hours ago||

"Deep learning is hitting a wall"

therealdeal2020 4 hours ago||

is it just hype building or real? I don't care, shut up and take my money haha

vonneumannstan 6 hours ago||

Are you guys ready for the bifurcation when the top models are prohibitively expensive for normal users? Is your AI budget $2000+ a month? Or are you going to be part of the permanent free tier underclass?

adi_kurian 6 hours ago||

If one is to believe the API prices are a reasonable representation of non-subsidized "real world pricing" (with model training being the big exception), then the models are getting cheaper over time. GPT 4.5 was $150.00 / 1M tokens IIRC. GPT o1-pro was $600 / 1M tokens.

vonneumannstan 5 hours ago||

You can check the hardware costs for self hosting a high end open source model and compare that to the tiers available from the big providers. Pretty hard to believe it's not massively subsidized. 2 years of Claude Max costs you $2,400. There is no hardware/model combination that gets you close to that price for that level of performance.

adi_kurian 5 hours ago||

Yes that's why I said API price. I once used the API like I use my subscription and it was an eye watering bill. More than that 2 year price in... a very short amount of time. With no automations/openclaw.

OsrsNeedsf2P 5 hours ago|||

Inference cost for the same results has been dropping 10x year over year [0]

[0] https://ziva.sh/blogs/llm-pricing-decline-analysis

ceejayoz 5 hours ago||

Sure, but "the same results" will rapidly become unacceptable results if much better results are available.

hibikir 5 hours ago|||

When we go with any other good in the economy, price is always relevant: after all, the price is a key part of any offering. There are $80-100k workstations out there, but most of us don't buy them, because the extra capabilities just aren't worth it vs., say, a $3000 computer, or even a $500 one. Do I need a top specialist to consult for a stomachache, at $1000 a visit? Definitely not at first.

There's a practical difference to how much better certain kinds of results can be. We already see coding harnesses offloading simple things to simpler models because they are accurate enough. Other things get dropped straight to normal programs, because those are that much more efficient than letting the LLM do all the things.

There will always be problems where money is basically irrelevant, and a model that costs tens of thousands of dollars of compute per answer is seen as a great investment, but as long as there's a big price difference, for most questions, price and time to results are key features that cannot be ignored.

swader999 5 hours ago||||

Yes, it will always be an arms race game.

esafak 5 hours ago|||

Or will they rapidly become indistinguishable since they both get the job done?

asadm 4 hours ago||

if it can pay my rent, why not?

jdthedisciple 5 hours ago||

Opus 4.6 is already incredible so this leap is huge.

Although, amusingly, today Opus told me that `LIKE '%emerge%'` is not going to match the string 'emergency' in SQLite.

Moment of disappointment. Otherwise great.
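
For anyone who wants to check, here is a minimal sketch using Python's built-in `sqlite3` module and an in-memory database. The helper name `like_matches` is made up for illustration; it just evaluates `value LIKE pattern` inside SQLite and returns the result. Since `%` matches any run of characters (including none), 'emergency' should match, and since SQLite's `LIKE` is case-insensitive for ASCII by default, so should 'Emergency'.

```python
import sqlite3


def like_matches(value: str, pattern: str) -> bool:
    """Hypothetical helper: ask SQLite whether `value LIKE pattern` holds."""
    conn = sqlite3.connect(":memory:")  # throwaway in-memory database
    try:
        # Bind both operands as parameters; SQLite returns 1 or 0.
        row = conn.execute("SELECT ? LIKE ?", (value, pattern)).fetchone()
        return bool(row[0])
    finally:
        conn.close()


if __name__ == "__main__":
    print(like_matches("emergency", "%emerge%"))  # True
    print(like_matches("Emergency", "%emerge%"))  # True (ASCII case-insensitive)
    print(like_matches("urgent", "%emerge%"))     # False
```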

bornfreddy 5 hours ago||

I only have 3 points against LLMs: they lack reason and they can't count.

FeepingCreature 5 hours ago||

'emer ge' is two tokens, 'emergency' is one. The models think in a logosyllabic language.

kypro 4 hours ago||

While we still have months to a year or two left, I will once again remind people that it's not too late to change our current trajectory.

You are not "anti-progress" to not want this future we are building, as you are not "anti-progress" for not wanting your kids to grow up on smart phones and social media.

We should remember that not all technology is net-good for humanity, and this technology in particular poses us significant risks as a global civilisation, and frankly as humans with aspirations for how our future, and that of our kids, should be.

Increasingly, from here, we have to assume some absurd things for this experiment we are running to go well.

Specifically, we must assume that:

- AI models, regardless of future advancements, will always be fundamentally incapable of causing significant real-world harms like hacking into key life-sustaining infrastructure such as power plants or developing super viruses.

- They are or will be capable of harms, but SOTA AI labs perfectly align all of them so that they only hack into "the bad guys'" power plants and kill "the bad guys".

- They are capable of harms and cannot be reliably aligned, but Anthropic et al. restrict access to the models enough that only select governments and individuals can access them, these individuals can all be trusted, and models never leak.

- They are capable of harms, cannot be reliably aligned, but the models never seek to break out of their sandbox and do things the select trusted governments and individuals don't want.

I'm not sure I'm willing to bet on any of the above personally. It sounds radical right now, but I think we should consider nuking any data centers which continue allowing for the training of these AI models rather than continue to play a game of Russian roulette.

If you disagree, please understand when you realise I'm right it will be too late for you and your family. Your fates at that point will be in the hands of the good will of the AI models, and governments/individuals who have access to them. For now, you can say, "no, this is quite enough".

This sounds doomer and extreme, but if you play out the paths in your head from here you will find very few will end in a good result. Perhaps if we're lucky we will all just be more or less unemployable and fully dependent on private companies and the government for our incomes.

CamperBob2 4 hours ago|

> If you disagree, please understand when you realise I'm right it will be too late for you and your family.

Funny, I was about to say the same thing to you! Life is full of little coincidences.
