Posted by Ryan5453 7 hours ago

Project Glasswing: Securing critical software for the AI era (www.anthropic.com)
Related: Assessing Claude Mythos Preview's cybersecurity capabilities - https://news.ycombinator.com/item?id=47679155

System Card: Claude Mythos Preview [pdf] - https://news.ycombinator.com/item?id=47679258

Also: Anthropic's Project Glasswing sounds necessary to me - https://news.ycombinator.com/item?id=47681241

754 points | 325 comments | page 4
zambelli 3 hours ago|
I'm glad to see that it stands its ground more than other models, on both technical and emotional topics. That is a genuinely useful trait for an assistant.
MisterBiggs 3 hours ago||
What happens once an agent can reliably get 100% on SWE-bench?
caycep 3 hours ago||
When do we get our Kuang Grade Mark Eleven icebreaker?
anVlad11 6 hours ago||
So, $100B+ valuation companies get essentially free access to the frontier tools with disabled guardrails to safely red team their commercial offerings, while we get "i won't do that for you, even against your own infrastructure with full authorization" for $200/month. Uh-huh.
SheinhardtWigCo 5 hours ago||
Yes, and that's normal. Coordinated disclosure is standard practice when the risk of public disclosure is unacceptable.
charcircuit 2 hours ago||
Risk for who? It feels unfair that the risk to myself is ignored "for the greater good of everyone else."
unethical_ban 6 hours ago||
I'm sympathetic to your point, but I'm sure there are heightened trust levels between the participating orgs and confidentiality agreements out the wazoo.

How does public Claude know you have "full authorization" against your own infra? That you're using the tools on your own infra? Unless they produce a front-end that does package signing and detects you own the code you're evaluating.

What has it stopped you from doing?

9cb14c1ec0 5 hours ago||
You can do pretty much anything you want with public Claude if you self-report to it that you have been properly authorized.
manbash 3 hours ago||
This will likely not see the light of day. It's the usual PR that gathers many "partnerships".

Expect to see lots of these in the upcoming months as the big companies scramble to keep from losing money.

copypaper 2 hours ago||
Yea, but can it secure systems from the unpatchable $5 wrench vulnerability?

https://xkcd.com/538/

kristofferR 3 hours ago||
This is pretty insane. A model so powerful they felt that releasing it publicly would create a netsec tsunami. AGI isn't here yet, but we don't need to get there for massive societal effects. How long will they hold off, especially as competitors get closer to releasing equally powerful models?
charcircuit 2 hours ago|
OpenAI did the same thing with GPT-2, trying to scare people into thinking it would end the internet. OpenAI even reached out to someone who reproduced a weaker version of GPT-2 and convinced him to change his mind about releasing it publicly due to how much "harm" it would cause.

These claims of how much harm the models will cause are always overblown.

throwaway13337 5 hours ago||
I really wanted to like Anthropic. They seem the most moral, for real.

But at the core of Anthropic seems to be the idea that they must protect humans from themselves.

They advocate government regulation of private open-model use. They want to centralize the holding of this power and ban those who aren't in the club from using it.

They, like most tech companies, seem to lack the idea that individual self-determination is important. Maybe the most important thing.

dralley 4 hours ago|
That is unequivocally true with some things. You don't want people exercising their "self-determination" to own private nukes.
throwaway13337 4 hours ago||
LLMs aren't nukes.

They're more like printing presses or engines. A great potential for production and destruction.

At their invention, I'm sure some people wanted to ensure only their friends got that kind of power too.

I wonder what world we would live in if they had gotten their way.

picafrost 6 hours ago|
> Anthropic has also been in ongoing discussions with US government officials about Claude Mythos Preview and its offensive and defensive cyber capabilities. [...] We are ready to work with local, state, and federal representatives to assist in these tasks.

As Iran engages in a cyber attack campaign [1] today, the timing of this release seems pointed. A direct challenge to their supply chain risk designation.

[1] https://www.cisa.gov/news-events/cybersecurity-advisories/aa...
