
Posted by thm 1 day ago

NSA lost access to Mythos amid Anthropic dispute(www.nytimes.com)
Unlocked: https://www.nytimes.com/2026/06/23/us/politics/nsa-lost-acce...
262 points | 283 comments
jawiggins 1 day ago|
> The White House and intelligence officials had pushed forward a classified contract between Anthropic and the N.S.A., which would allow the spy agency to use the company’s technology for a variety of purposes, including intelligence analysis and detecting new computer vulnerabilities.

Ironic that both sides are playing a horseshoe game:

Gov: The model is both a supply chain risk and also we'll DPA you if you don't give it to us.

Anthropic: The model is both like a nuclear weapon in terms of national security implications and safe for general release.

anshumankmr 20 hours ago|
I mean graphite control rods do exist in nuclear reactors to absorb excess neutrons, preventing the fuel from going critical & making it technically safe for general use (THOUGH of course disasters have happened)
chasil 1 day ago||
'Mythos “broke into almost all of our classified systems, not in weeks, but in hours.”'

Is Mythos a significant danger?

The curl experience does not suggest that hysteria is warranted, but this gives me pause.

maxall4 1 day ago||
Or, alternatively, it may suggest that the NSA’s classified systems are not very secure, which seems at least as possible: they may rely on requiring physical access to these systems to even attempt to penetrate them.
nl 1 day ago|||
Curl is such a small utility, and the effect of any single problem is limited.

Mythos's great strength was finding multiple vulnerabilities and chaining them together to break a whole system.

Look at it like this: It found one confirmed, minor vulnerability in Curl (but I don't think they have said what it was?). In another system that used Curl it's possible it could have exploited that vulnerability to chain to another, bigger vulnerability that was normally inaccessible.

That's how systems get broken.

prirun 12 hours ago|||
'Mythos “broke into almost all of our classified systems, not in weeks, but in hours.”'

And the government's response was to limit access to US citizens? I don't believe this for a minute. If Mythos could actually break into all these systems, the government would declare it a national security risk and it would never see the light of day for anyone outside government staff with security clearance.

mos_basik 8 hours ago|||
additional context from the article regarding that particular statement:

"[the statement] was oversimplified... In reality, the tests involved “red teams” of N.S.A. analysts who were using Mythos in a highly tailored environment that would be extremely unlikely for an adversary to replicate, officials said. The red teams began their tests within classified N.S.A. systems designed to be accessible only from certain computers and completely cut off from the broader internet.

The tests found that Mythos was able to identify cybersecurity flaws within that classified network quickly, but it did not actually break into those systems, the officials said."

JKCalhoun 12 hours ago|||
Why are these things online at all? Is that a requirement for them to be useful?
enraged_camel 1 day ago||
>> The curl experience does not suggest that hysteria is warranted, but this gives me pause.

What about the Firefox experience?

Or are we conveniently ignoring things that don't confirm conclusions we've already reached?

chasil 1 day ago|||
I'm not as familiar with that. I do agree that it sounded substantial.

I just think that a coreutils flaw is not as substantial as a rendering engine exploit.

readthenotes1 1 day ago||
Hadn't they spent a year hardening curl with various AI before they tried Mythos?
fc417fc802 1 day ago|||
Yes. The original curl post didn't say anything like "mythos sucks" but rather "it's only a minor improvement in comparison to already widely used models".
Chu4eeno 20 hours ago|||
Yes, and Firefox had not.

Which I think points at Mythos not being some big jump in capability that finds things earlier LLMs didn't; it seems to mostly come down to a massively increased compute budget and them finally catching up in context sizes.

ai_fry_ur_brain 1 day ago|||
Aren't you trying to do the same thing? LLM people, you're cooked.
teravor 1 day ago||
mythos allowed mediocre people to get results by holding their hand through the process, or just ignoring their irrelevant input and knowing what to do.

if you throw millions of tokens at IDA Pro MCP with the right prompt, let's just say security by obscurity fails miserably, because there is no obscurity when the LLM chews through the decompilation.

baq 1 day ago||
It isn’t bad, it isn’t good. It’s just how the world looks now. All software is open source now, some of it is just more open, some of it is less.
virtualritz 1 day ago|||
> mythos allowed mediocre people to get results by holding their hand through the process,

Yes, just like early cars allowed mediocre horse riders to get from A to B with dignity.

Or like my Japanese rice cooker allows a person like me, utterly shitty at preparing this, to eat some rice that is cooked to perfection.

Etc.

greggsy 1 day ago|||
I mean, the calculator is my go to analogy I keep bringing up in this debate.

It lets someone with mediocre long division skills just do the thing they need to do with fewer steps and less friction.

IDA itself is a tool that helps you decompile code without having to do a lot of things.

teravor 1 day ago||
knowing long division does not help you make the calculator do division better.
dlmanning 1 day ago||
Understanding math absolutely makes a calculator more useful to you though.
teravor 1 day ago||
and if you work that into the parent's analogy you get the point I was making
dlmanning 1 day ago||
Apologies. I misread the comment to which you replied and gave them unwarranted credit for not making the same tired point about calculators.
ai_fry_ur_brain 1 day ago|||
Should mediocre people be performing heart surgery?
garyfirestorm 1 day ago|||
It depends. A mediocre doctor in a remote area with the right tooling assistance, as opposed to no one being available for someone who urgently needs one? Yeah, this should be a thing. Should a software bro in NY perform it in a dark alley despite having the best doctors a few blocks away? Maybe not…
p-e-w 1 day ago||||
Lots of mediocre people already are.
gaiagraphia 1 day ago||||
I'm sure many 'mediocre' people perform heart surgery. Only 100 years ago, the idea of a person without a certain surname or race doing so would've been a ghastly proposition, no?
dlmanning 1 day ago||
Do you... think heart surgery has become LESS dependent on surgical skill in the last 100 years? Cardiovascular surgeons spend MORE time in training now than they did 100 years ago.
gaiagraphia 1 day ago||
Did heart surgery as we know it exist 100 years ago, or are you trying to conflate things to make a point?

"Heart surgery" isn't a technique. Name something, literally anything connected to the profession, and tell me whether the training time is naturally bound to keep going up and up.

gaiagraphia 1 day ago|||
"mediocre people"

I'm glad to see the mask is falling off the privileged caste.

Is there anything inherently wrong about open access to tools? (Apart from rent payments).

dlmanning 1 day ago|||
The "privileged caste" being people who actually expended the effort to learn things for themselves?
gaiagraphia 1 day ago||
And such people learnt everything from the beginning? From fire?

Where's the cut off point of where learning something for yourself becomes the signal for entrance to the enlightened caste?

dlmanning 1 day ago||
It's the point where you expend effort to learn a useful skill.
djhn 22 hours ago|||
> (Apart from rent payments).

Privilege enables you to rent competence, historically by paying other people. The slop companies will now sell you a simulacrum of competence by the token.

The fact that competence can (could?) only be acquired through sustained effort over a long period of time is (was?) levelling the field.

Selling simulated competence perpetuates privilege, instead of dismantling it like you seem to claim.

joe_mamba 1 day ago||
>mythos allowed mediocre people to get results by holding their hand through the process

Isn't this what technological progress looks like? Industrial tools allowed mediocre people to improve their productivity by orders of magnitude, which is how we managed (in the past) to build so many amazing things with less human toil and suffering than previous generations.

imdsm 1 day ago||
Progress isn't always welcomed by incumbents who have built their moats on hoarding knowledge rather than on being adaptable
losteric 1 day ago|||
It seems like AI is really hurting the people who don't have a hoard of experience - the juniors and early mid-level tech people.

The incumbents with experience are doing amazing. PMs with Mythos aren't replacing the PE with 20 years of experience lol.

pixl97 1 day ago||
I mean that is what most technology looked like at first too.
dlmanning 1 day ago||
Oh okay. So where's the point where AI starts to encourage the development of a new useful skill set among people early in their careers?
interstice 18 hours ago||||
Are you saying programmers aren't adaptable? I don't think I've ever seen this field pause to take a breath.
dlmanning 1 day ago|||
Not all of us think encouraging people to outsource their own thinking to proprietary models is actually "progress."
gaiagraphia 1 day ago||
Is there a historical precedent as to what happened when the upstart denied capability to the empire?

The closest I can think of is the bronze age collapse.

sawjet 1 day ago||
There is no consensus on what caused the bronze age collapse.
dwheeler 1 day ago|||
Perhaps, but I think volcanic eruption followed by system collapse is very compelling. Here is the story I find most convincing from the experts whose works I have read.

It likely started with a volcanic eruption, leading to widespread famine. Those in western Europe who didn't want to starve migrated en masse, as whole families, becoming the sea peoples. The powerful empires struggled to feed their people, and many were destroyed by the forced migration from the sea peoples. Egypt barely survived, but only as a shadow of itself. Many of the others were destroyed by those who had survived on marginal lands and didn't need complex societies to keep themselves fed.

Iron can't be the cause, as iron weapons pre-existed the Bronze Age collapse. I think the evidence is stronger that the collapse forced widespread adoption. The collapse devastated long-distance trade networks, which cut off the supplies of tin needed to make bronze. The scarcity pushed people to rapidly improve iron smelting.

I'm not a professional historian, but I do find the topic interesting. We should try to learn from past disasters to prevent repetition.

See Eric H. Cline's "1177 B.C.: The Year Civilization Collapsed";

Epimetheus video "What was life like after the bronze age collapse (extended version)" https://www.youtube.com/watch?v=uM6JSS3l-IQ

gaiagraphia 1 day ago||
What's the time period between iron being widespread and the so-called collapse?
gaiagraphia 1 day ago||||
Sorry, I forgot to ask. What are the top 5 theories, and do you see any modern parallels?
wil421 1 day ago||
Are you an AI?
gaiagraphia 1 day ago||
Try not to be. Apologies if the link ain't obvious enough.
gaiagraphia 1 day ago|||
Then it makes it more riveting when modern-day phenomena happen, surely?

Unless you subscribe to a historical channel?

CoastalCoder 1 day ago|||
> The closest I can think of is the bronze age collapse.

No idea about your question, but I'd love to hear more about this part.

dofm 1 day ago||
If Mythos is still running internally, the NSA still have some access to it. It's just crazy to believe there aren't CIA and/or NSA plants (tacitly acknowledged or otherwise) inside Anthropic and OpenAI.

But Mythos is still only an advanced LLM so I am not sure what all this breathy fuss is about; it sounds like the PR war more than anything.

If the NSA aren't themselves training technologies that are at least as powerful, that would modestly surprise me.

Not that you need an LLM to monitor the risks to the USA. You just need Tulsi Gabbard's emails.

SV_BubbleTime 1 day ago|
I think it’s beyond a mastery of PR. They literally called it Mythos and built a literal myth around it. I mean… maybe people just want the soap opera.
zb3 1 day ago||
> That contract has not been finalized, and some Pentagon officials want the N.S.A. to find a way to work with other models.

Good, fsck NSA, that's the last organization I'd ever want to have access to Mythos. I hope this administration's incompetence will prevent them from regaining access for as long as possible

baq 1 day ago|
It’ll be the first organization to get access to Epic/Saga/Legend/Bible/Torah/Sutra/Vedan/whatever the Mythos+1 is called - and it might be the only one with this privilege
bb88 1 day ago|||
More likely they'll convince Congress they need their own. Only it will be 20-200 times more expensive, and the US taxpayer will be paying for it but won't get access.
axus 1 day ago||
That would meet the OP's goal of NSA never getting a frontier model, "behind schedule" is the natural partner of "over budget".
Computer0 1 day ago|||
They will never be able to read all the words in my head that spell out exactly what I want to have happen at that org.
Woodi 22 hours ago||
> NSA lost access to Mythos

That's the funniest thing I've read in a long time :)) I mean: it's so absurd, almost like things we had under real socialism in the 80s :> But, yeah, freedom has consequences.

sometimelurker 14 hours ago||
honest question: does the nsa have the ability to take the model weights from anthropic? also, as I understand it anthropic employees from the USA have mythos access, and I don't see why this shouldn't extend to the nsa. this seems pretty silly and kinda unbelievable. commenting again to add more information to my opinion and ask that you don't just blindly downvote bc I don't believe the nsa doesn't have mythos
AustinDev 1 day ago||
They could easily take the weights if they wanted. I don't believe they meaningfully lost access.
HlessClaudesman 1 day ago||
Who will make them the next set of weights?

If a government can just seize the product of someone else's labour, either they will end up as slave owners or without willing workers.

dofm 1 day ago|||
Serious question: do you think the NSA aren't training their own LLMs? (With or without Anthropic and OpenAI's help)

It's a perfect technology for their uses, they get a big chunk of a $100 billion black budget, and they've had access to the research for at least as long as we have.

xeubie 1 day ago|||
I can't say what they're doing now because I worked for the NSA 15 years ago but the view of them as an omnipotent power is a product of Hollywood. The government is good at throwing an ungodly amount of resources at something to get a result through sheer attrition, and so they are often the source of original development of technologies. The private sector has always been much better at building a technology to greater sophistication and efficiency. There may be blue badgers in Fort Meade trying to train models but there is no chance they are competitive with the frontier AI companies. It's like saying the government has an amazing home-grown fighter aircraft that is beyond what Lockheed has ever made...they delegate that stuff to private companies for a reason.
LPisGood 1 day ago||
I’ve heard of “blue suiters” for air force brass, but never blue badgers.

Anyways, isn’t NSA one of the largest employers of mathematicians in the world? Surely they’re doing something useful.

xeubie 1 day ago|||
Blue badges were for government employees (like I was), and green badges were for private contractors. And yes they have a lot of math and physics guys; my own physics lecturer was in my orientation class, actually. He was there for quantum computing, which reinforces my point. The government can be good at pioneering unproven / uncommercialized technologies, but in general they are like a blunt weapon; the profit motive and lack of bureaucracy eventually makes the private sector far better for improving the technology later. In the case of LLMs, they didn't even originate in government, and I don't think there's any chance they are being developed there at a more advanced level.
rob74 1 day ago|||
Cryptography, I guess? Not really related to LLMs...
zhoBEENG 1 day ago||
Crypto and AI are deeply connected, and you see similar structures/problems in both. Shannon, the “Father (or whatever) of AI”, worked for the NSA and published many papers there that were later declassified.

Here is a banger quote on this by Shannon’s boy Warren Weaver, keeping in mind LLMs came from translation problems:

“One naturally wonders if the problem of translation could conceivably be treated as a problem in cryptography. When I look at an article in Russian, I say: 'This is really written in English, but it has been coded in some strange symbols. I will now proceed to decode.'”

nl 1 day ago||
> Crypto and AI are deeply connected, and you see similar structures/problems in both.

I mean yes, both deal with information theory.

That's a long way from any practical insight.

ben_w 1 day ago||||
> Serious question: do you think the NSA aren't training their own LLMs?

Given the evergreen discussion of "are these companies making a profit"*, I think any LLMs that the NSA (or any other government agency worldwide) may be making are quite far from the leading edge.

* Person A: "they are making a loss!" Person B: "Only if you count training, they make a profit on inference, look at what it costs to run comparable open models on generic cloud servers" A: "Sure, but if they don't train new models they'll be left behind, so they're still making a loss"

That and the way compute is now measured in GW, I think even random low budget vloggers just getting started would be able to spot if the NSA was doing anything significant just from the extra heat emissions or power plants getting built.

ACCount37 1 day ago||
Model training does NOT dominate the model costs.

The ratio of inference compute to training compute is ~10:1 for popular frontier models. Models are routinely overtrained past the Chinchilla optimum now because it makes an immense amount of economic sense to do so.

Worse the more niche and unused your models get, but when this "making a loss" fuckery pops up, it's usually about the big guys like Anthropic, OpenAI, GDM and maybe xAI and Meta. Of which only the latter can be accused of not selling enough inference to offset the training runs.

The real money sinks are: R&D and infrastructure buildouts.

HlessClaudesman 1 day ago||||
I don't think there is much overlap between people capable of building cutting-edge LLMs and the people who want to build a cutting-edge LLM for the government.
dofm 1 day ago||
The NSA managed to deliberately insert a backdoor into elliptic-curve cryptography right under the noses of everyone capable of making elliptic-curve cryptography.

I wouldn't count them out.

tux3 1 day ago|||
Mathematicians in academia are paid a little less than AI researchers. Companies are willing to pay billions to steal the few people capable of driving development of frontier LLMs from each other. Cryptographers don't quite enjoy the same popularity.
wolvoleo 22 hours ago||
Does getting paid more make people smarter?

Academics especially tend to do their work out of interest; monetary gain isn't their primary goal

bigfatkitten 18 hours ago||
When people with a particular aptitude and skillset can make 10x as much money doing job A than job B, there is a bias towards job A.

Of course, that doesn’t mean nobody will do job B for other, non-financial reasons.

mpyne 1 day ago|||
> The NSA managed to deliberately insert a backdoor into elliptic-curve cryptography right under the noses of everyone capable of making elliptic-curve cryptography.

That sort of proves the opposite point, assuming you're referring to Dual EC DRBG, because the flaw was noticed very early on, by people who weren't even involved in its development.

polytely 1 day ago||||
They probably also have an insane dataset
stronglikedan 1 day ago||||
> do you think the NSA aren't training their own LLMs?

They probably already have access to Sentinel, so they wouldn't need to train their own.

curt15 1 day ago||||
Would they be able to hire top ML talent with US government salaries?
doug_durham 1 day ago||||
The NSA is a government agency. They are certainly not training any world-class LLMs. They probably have some specialized fine-tunings of existing models, but that's it. They don't have the capacity.
segmondy 1 day ago||||
Serious question, do you realize that the NSA are mere mortals? Do you realize how much it takes to train a model? Does the NSA make their own chips or planes? The NSA buys a lot of technology because they can't make their own.
dofm 1 day ago|||
You mean "Rhetorical question," and I didn't need patronising.

They have at least one pretty vast, largely classified data centre in Utah, with a sizeable chunk of the black budget and they also have pretty large data sets.

halJordan 1 day ago||
What's in Utah is data storage.
convolvatron 1 day ago|||
NSA has had their own supercomputing program for decades. they design and produce their own large scale machines. chips, fabrics, arithmetic units, all of it. they also employ quite a number of hardcore mathematicians, computer scientists, and systems wranglers. if they decided it was of strategic importance there is absolutely no reason they couldn't train their own models.
distill17801 1 day ago||
I guess we're just conspiracy theorists for landing at the objective conclusion that three letter government agencies:

- find "modern AI" to have strategic importance

- have ways to spend loads of money while having a front-facing budget on the record

- could be running a PR program to have Americans think they "buy" access to models like they do, but the AI companies were taken over by these agencies long ago

Look at Google, Microsoft...Apple got away with it by having as much on-device operation as possible so they could wash their hands, honestly saying "We don't have it."

This is the world's largest data gathering operation. Remember after 9/11 when the NSA copied as much Internet back bone traffic as they could?

I'm not for or against, even as a resident, but we certainly shouldn't be naive.

convolvatron 1 day ago||
as someone who actually worked at the NSA pointed out earlier in this thread, they have plenty of resources, but also plenty of politics and some execution problems. so I wouldn't put money on them making a great model, but to say that they are completely incapable of doing anything is probably quite wrong.

the issue here that is a foregone conclusion, regardless of where the model comes from and which chips it runs on, is that now they can reasonably comb through all the stuff that they've been collecting. that's a pretty huge operational change.

dgellow 1 day ago|||
You cannot really hide the amount of compute required to train an LLM. Do we have actual clues that the NSA is training their own frontier model?
__MatrixMan__ 1 day ago|||
Are you proposing that this government is above being slave owners?
infinite_spin 1 day ago|||
the success of mythos isn't from model weights, it's from the harness and toolset it has access to
krzyk 1 day ago|||
Is it really?

Harness is important for model performance, but weights are surely more important; without that you would have haiku doing the work.

dofm 1 day ago||||
I agree but that's even easier to exfiltrate, surely.
nickthegreek 1 day ago||
given some time, surely. but that seems harder with the model turned off.
sometimelurker 14 hours ago||||
source? credible rumor has mythos at 10 trillion params
FergusArgyll 1 day ago|||
Was Fable / Mythos in pi or opencode that much worse?
antonvs 1 day ago||
Probably, because those harnesses are less inclined to set all the tokens on fire in order to achieve a goal.
Onavo 1 day ago|||
If they use the Defense Production Act, would Dario even be able to resign in protest?
AustinDev 1 day ago|||
If they wanted to officially take the weights, the DPA would work and Dario could do nothing. If they wanted to do it in a clandestine manner, no one could stop them and no one would know. It's very likely they already have all the weights from all the frontier models. I mean all the frontier models are capable of being served from AWS Bedrock, so the weights aren't exactly locked in some air-gapped vault.

It would be easy to make a national security justification to take the weights in a clandestine manner especially because Anthropic supposedly got caught giving China access to the model through a cutout.

JackFr 1 day ago||
Pretty sure even under the DPA, taking without fair compensation would be a violation of the takings clause of the 5th Amendment and wouldn't withstand legal scrutiny. If they wanted to get them clandestinely, yeah, they'd likely get away with it, but it is stealing.
torstenvl 1 day ago||
To be a taking, it would have to be property. Weights are almost certainly not property.
Onavo 1 day ago||
That's for the courts to decide.
torstenvl 21 hours ago||
Correct. What makes you think existing case law doesn't apply to model weights?
rurban 1 day ago|||
John Cook resigned, so Dario might resign also. But he would make it public, so they won't do it
Onavo 1 day ago||
> John Cook resigned

John Cook?

dofm 1 day ago|||
He means John Apple I think.
antonvs 1 day ago||
I think you mean Tim Mac
rurban 1 day ago|||
Oops, Tim Cook. Sorry
medlazik 1 day ago|
AI marketing bullshit stunts are unlike anything I've seen in 30 years. It started with MS Copilot's so-called capabilities for work, which were completely made-up use cases that didn't work at all (and still don't, 3 years later). We've had OpenAI's "AGI is coming" and "AI will take your job"; now we have Mythos being so "dangerous" for cybersecurity, which of course makes the average Joe interpret it as Anthropic being "the better overall company, the NSA uses it!!". I mean gov foes with Anthropic are probably true, but the marketing is to blame, not Mythos capabilities. This is all so fucking pathetic
thewebguyd 1 day ago||
> and "AI will take your job"

Don't forget, it's no longer cool to say that now that the public has pushed back. The fact they all changed their tone away from taking jobs tells you that it was all just entirely marketing.

yoyohello13 1 day ago|||
All the CEOs very quickly changed their messaging after Altman's house got molotoved.
scottyah 1 day ago|||
Seems to me that they were mostly right, and the message was received by the right people. No need to ensure it gets distributed to the wrong people.
chasd00 1 day ago|||
I haven’t heard anything about AGI in a long while. Oh yeah, and per conversations last Jan we were all supposed to be out of our jobs by now.
joquarky 23 hours ago||
I'm just glad there are so many jobs. Just look at the latest unemployment numbers! I wonder if this era of peace and prosperity will be remembered as the peak of humanity?

And did you see that chocolate rations increased again last month! It's literally incredible.

tempodox 1 day ago|||
But the propaganda deluge was a smash hit so far, HN is drowning in “AI” BS, and astroturfers and spin doctors haven’t seen that much business since the cold war. They made more profit than shovel salesmen in the gold rush.
colechristensen 1 day ago|||
I was able to identify, diagnose, fix, and upstream a minor bug in an Erlang/OTP SSH key implementation with Opus in maybe 20 minutes (+2 weeks or so for upstream). It is not impossible that I could have done this before, but it would have taken days or weeks. The actual fix was about 2 lines of code, hardly AI slop, but getting there would have been quite the slog, and I never would have done it.

There is a lot of reason for AI skepticism out there, but people tend to do massive overcorrections and underestimate the force multiplier it can be, particularly for people with some idea of what they're doing and a good grasp of how to take advantage of the tool.

medlazik 1 day ago|||
I said absolutely nothing about LLMs, which are a fantastic tool I'm using every day. I'm talking about marketing.
gallerdude 1 day ago|||
So let’s say you’re in Anthropic’s shoes. You see that LLMs are getting better and better, and it’s very possible that they will have some impact on jobs in the next few years, and a very meaningful impact on cybersecurity.

Is it more ethical to stay silent about these concerns, as you might have a bit of self interest? Or even if it looks a bit self interested, is it better to warn people ahead of time? I think the latter is obviously the better position.

gazebo2 1 day ago|||
Are we really saying that Anthropic claiming AI would take over industries was some benevolent ethical move rather than marketing their product as a cheap replacement for human labor that works in any industry? Wouldn't the ethical thing, if they were actually concerned about labor displacement, be to shut down the lab and work to disrupt and disable other labs instead?
nl 1 day ago||
Oppenheimer believed that technological progress is inevitable: if something can be built it will be.

Anthropic (and Deepmind, and some at OpenAI) believe the same thing.

Their ethical argument is:

1) This technology is coming whether or not our company does it.

2) Strong AI needs to be under human control, and we are the best placed to develop techniques to make this happen.

To be very clear: Anthropic (at least) is very happy to restrict access to their best models. They have continually campaigned for regulation to make sure others have to do the same.

> Wouldn't the ethical thing, if they were actually concerned about labor displacement, be to shut down the lab and work to disrupt and disable other labs instead

Personally I strongly reject the idea that labor displacement is unethical.

It will be a serious problem to deal with, but that doesn't make it unethical.

The steam engine displaced labor. That doesn't make it unethical.

dlmanning 1 day ago||
> Personally I strongly reject the idea that labor displacement is unethical.

Oh, well if you STRONGLY reject it I guess that's it.

> It will be a serious problem to deal with, but that doesn't make it unethical.

What WOULD make it unethical?

> The steam engine displaced labor. That doesn't make it unethical.

The steam engine also created new jobs to replace what it eliminated. It wasn't a mostly one-sided wealth transfer to the elite.

nl 1 day ago||
> The steam engine also created new jobs to replace what it eliminated. It wasn't a mostly one-sided wealth transfer to the elite.

Indeed.

You make my point for me.

wolvoleo 22 hours ago||
What are those to-be-created jobs going to be doing that AI won't be able to?

There are two big differences with the steam engine: this change is happening much faster, so society has much less time to adapt, and it's got a much wider scope. Steam engines only replaced a small category of jobs.

nozzlegear 1 day ago||||
Was it more ethical for the boy who cried wolf to have cried wolf so many times that nobody believed him when a wolf finally did show up?
aspenmartin 1 day ago||
Be specific: what are you talking about? Industry has been continuously warning about many of the complex problems that are going to happen as a clear consequence of the technology. I don’t know of any problem they have talked about that hasn’t either already come to fruition in one sense or another or that just hasn’t yet arrived. Dario has been predicting the end of coding for a long time now and look where we already are.

So yea no it’s more like it’s important for industry leaders and those closest to model development to proactively identify the issues that they don’t have complete control over or that we don’t have a regulatory framework for.

Super puzzling to see these comments and of course with zero specifics just “they’re all liars and grifters”

nozzlegear 23 hours ago||
I'm talking about the breathless alarmism that Dario and his company push out as a marketing strategy. They've given us such gems as these:

- "It’s a bit like selling nuclear weapons to North Korea" (from the company that can't go more than a day or two without serious downtime)

- "We are releasing a model that is too powerful for the public"

- "It would be good for the world to have the option to slow or temporarily pause frontier AI development."

- "I believe that biological risks may soon follow, and that serious AI autonomy risks may not be far behind."

You can fill my ear with nitpicks about there still being time for these cries of wolf to be borne out, but be prepared for me to wax philosophical about all things being possible given an eternal timescale.

> Dario has been predicting the end of coding for a long time now and look where we already are.

Where? It seems exceedingly unlikely that developers have all been phased out while I wasn't looking, as Dario prognosticated. And even if they all up and disappeared, AI still hasn't found a toehold outside of the relatively niche market of agentic coding.

ifwinterco 1 day ago||||
The issue is both OpenAI and Anthropic have lied so many times that it’s no longer rational to take anything they say at face value.

Also: they don’t have to know they’re lying to say things that aren’t true. There is definitely some cult-like behaviour at the moment on the west coast

aspenmartin 1 day ago||
Be specific, what do you consider their lies to be? Also, this is pretty straightforward. You have a decade of extremely stable and predictable performance trajectory. It’s easy to see the writing on the wall. You can feel whichever way about their motivations and ethics, but if you read, say, Dario’s raw words, they are pretty reasonable. We have to have a good regulatory framework and do what we can to prepare ourselves while also not ceding a critical strategic advantage. The west coast is always cult-like; that’s not new. And it ignores the very real substance to the discussion.
ifwinterco 21 hours ago||
Every year since 2023 the models are too dangerous to release and in 12 months all white collar jobs will be obsolete. This might not have been a deliberate lie but it's clearly been untrue and they've said it again and again.

Predictions with wrong timing are frankly worthless. I predict at some point in the future the S&P 500 will be at 10,000. Of course I'm guaranteed to be right. But have I really predicted anything useful?

If Dario was really worried about protecting the sheep, he wouldn't cry wolf every five minutes because everyone knows that's the worst possible thing to do.

And if you want to ask if Altman is trustworthy... ask Satya Nadella or anyone else who's ever made the mistake of doing business with him

aspenmartin 7 hours ago||
> Every year since 2023 the models are too dangerous to release and in 12 months all white collar jobs will be obsolete. This might not have been a deliberate lie but it's clearly been untrue and they've said it again and again.

How is a prediction a lie? Did they tell you "this will definitely happen in X time"? Their speculation is not only valuable (they are the closest to the technology) but also necessary (they need to buy long term compute contracts so these predictions are literally what they have to bet their real money and company success on).

They have said again and again that this will make an incredible amount of tasks obsolete, and they are of course right about this. The models _are_ dangerous to release, every time we hit the frontier. This has become _increasingly true_.

> Predictions with wrong timing are frankly worthless.

Who cares?

> I predict at some point in the future the S&P 500 will be at 10,000. Of course I'm guaranteed to be right. But have I really predicted anything useful?

You aren't cherry picking and strawmanning here? Should we have a tour of all of the things that have indeed been predicted well and already come to fruition? Was 2025 "the year of agents"? It very much was, wasn't it? Additionally, unlike the S&P, performance trajectory, for almost a decade, is incredibly stable and predictable. It's hard to know, a priori for a given task or category of tasks, what specific error rate will trigger a phase transition but it's absolutely obvious and clear that this will happen quickly. It has indeed happened quickly. Does 2026 coding look anything remotely like 2024?

> If Dario was really worried about protecting the sheep, he wouldn't cry wolf every five minutes because everyone knows that's the worst possible thing to do.

No, you're right, he would make well-reasoned arguments for the types of problems we need to address urgently. Hmm... that feels pretty ethical.

> And if you want to ask if Altman is trustworthy... ask Satya Nadella or anyone else who's ever made the mistake of doing business with him

I don't feel either of them are trustworthy; they are CEOs acting in their companies' best interest. But the suggestion that the Mythos delay was some sort of PR ploy is some of the most extreme mental gymnastics I've seen. I listen to the actual words spoken by these people and consider the hard data that is in abundance at this point. I listen to the large body of research on alignment and safety and measurement that anyone can read for themselves or use AI agents to digest for them.

ifwinterco 6 hours ago||
I’ve just watched enough old Adam Curtis documentaries to know historically how these things always end, true believers in many dead ends have exactly this kind of zeal.

Very smart people, reasoned arguments, “science”, all wrong.

But maybe this time will be different

watwut 1 day ago|||
I think that Anthropic is fully absolutely unethical. And they lied a lot. They were actively trying to make the doom happen while trying to cash out maximally on doom trolling.

If they were actually concerned over social impact, they would try to minimize it. They could have sold their product as a tool to be used to make the economy boom; they tried to sell it on the promise to make it shrink for most people.

It really does not matter how much they believed their own doom predictions, because they were actively trying to make them true whether realistic or not.

fwipsy 1 day ago|||
Economic growth and short-term job loss are both results of automation. Anthropic seems to have been pretty honest about that to me?
watwut 20 hours ago||
If only they wrote in normal calm economic terms as you seem to imply ... and I wrote "shrinking economy for most people" not growing.
aspenmartin 1 day ago|||
> They were actively trying to make the doom happen while trying to cash out maximally on doom trolling.

These words make no sense. Anthropic delayed the Mythos/Fable rollout. A Mythos model without safeguards would have been a pretty bad idea, and they sacrificed a ton of revenue and risked being scooped by any of the other labs in the meantime. Frontier models are only frontier temporarily until the next lab releases their model. Of course they are a company and need to act in their own best interest.

The problems we need to think about as we march quickly towards even more capable systems are also clearly serious. Why on earth is it a problem to point this out?

> If they were actually concerned over social impact, they would try to minimize it. They could have sold their product as a tool to be used to make the economy boom; they tried to sell it on the promise to make it shrink for most people.

What a really weird take; they employ some of the best safety and alignment teams in the industry and this is an active area of research that they are campaigning for more attention on. You complain about them “doom trolling” and then complain they don’t do anything about…the doom? No sense at all.

It is perfectly consistent to (1) sound an alarm and (2) march full steam ahead as quickly as they can. If they don’t do (1) that’s unethical. If they don’t do (2) someone else will. I would rather someone like Dario align these models than the CCP. Plus it would be nice not to have a war over Taiwan, which is inevitable if China gains enough of the upper hand in this AI race.

colechristensen 1 day ago|||
The point I'm trying to make is Anthropic's marketing about broad security risk related to the capability of its models is a valid concern though their dog and pony show really overdid it, probably to the detriment of us all for many reasons. It is indeed amplifying the abilities of people to find and exploit security issues.

The point of my anecdote is I was able to identify and fix an at least security-adjacent bug in a language I could charitably consider myself a novice in. It happened to be very unlikely to have a security impact, but that was mere chance. LLMs expand the pool of people able to find and exploit security problems and we're all considerably more vulnerable as a result.

The biggest security threat was always someone bored with $20; a lot of attacks could be ignored or at least not prioritized with that threat model. This isn't true any more and our attack surface has gotten a whole lot larger.

aspenmartin 1 day ago||
What was the dog and pony show?
colechristensen 1 day ago||
White House and Anthropic hold 'productive' meeting amid fears over Mythos model https://www.reuters.com/world/anthropic-ceo-dario-amodei-arr...

This and other things around April

archagon 1 day ago||||
Force multiplier? Or low-hanging fruit pruner?
timeninja 1 day ago|||
What is the difference when every problem becomes low-hanging fruit?
archagon 1 day ago||
OP described a simple 2-line fix that would have been annoying to find by hand. That's a matter of heuristic search. The majority of problems in software engineering do not fall in this category.
colechristensen 1 day ago|||
More than low-hanging fruit, I think it would have been legitimately hard to find. It only triggered in 1 of 512 runs and probably would have required some expertise in crypto algorithms.

BUT regardless, pruning low hanging fruit for any task IS a force multiplier. So much of so many tasks are easy but tedious. Finding libraries, documentation not matching code thus reading code, correct syntax/arguments, and just tons of straightforward tasks which are not HARD but time consuming.

DyslexicAtheist 1 day ago|||
> I was able to identify, diagnose, fix, ...

a link to the PR or Changelog would strengthen this comment that it actually happened?

colechristensen 1 day ago||
Find it yourself. On any recent Erlang release, create an SSH server with their library. Using the only available post-quantum algorithm, connect to the server 1,000 times. You should get one or two key exchange failures (1/512 chance to fail).
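
For a rough sense of the numbers in that reproduction, here is a minimal sketch of the arithmetic. It assumes each connection attempt fails independently with probability 1/512 (the independence is an assumption, not something stated above), and the variable names are purely illustrative.

```python
# Assumption: every connection attempt is independent and fails the
# key exchange with probability 1/512, as quoted in the comment.
p_fail = 1 / 512
attempts = 1000

# Expected number of failed key exchanges over all attempts.
expected_failures = attempts * p_fail

# Probability of seeing at least one failure in the whole run.
p_at_least_one = 1 - (1 - p_fail) ** attempts

print(f"expected failures: {expected_failures:.2f}")
print(f"P(at least one failure): {p_at_least_one:.3f}")
```

Under that assumption you'd expect about two failures on average, and a single run of 1,000 connections would show at least one failure most of the time but not always, so a clean run doesn't prove the bug is gone.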
expedition32 1 day ago|||
The US has gone all in on AI because it is one of the few things in which they still have an advantage over Asian countries. I wouldn't use the word pathetic but rather "desperation".
ianm218 1 day ago||
So is your position that e.g. the Five Eyes [1] cyber security leaders are just pretending that AI cyber security is a serious thing to play into the geopolitical east vs. west thing and it's not genuine?

It just feels like people are starting to reach for conspiracy theories rather than engage with the idea that these models might actually be dangerous.

[1]. https://thehill.com/policy/technology/5936339-ai-cybersecuri...

michaelt 1 day ago||
The “Five Eyes cyber security leaders” aren’t exactly famous for their political independence, or for having the public’s best interests at heart, or erring on the side of regulating less.

You don’t get very far in the spying profession with honesty.

bflesch 1 day ago|||
We should seriously reframe this whole AI thing to "SI = simulated intelligence".

It's google in a box. Great achievement, makes knowledge work faster, but please stop bothering everyone else.

The Uber and Groupon people became billionaires, so the "Simulated Intelligence" folks will also achieve it. No need to worry and drown everyone in these bs stories only non-tech people believe.

ianm218 1 day ago|||
Can you describe your experience using modern AI tools that led to this conclusion? It is hard for me to wrap my head around how my perception could be so different from someone else in presumably the same or similar profession. I'm not asking this in bad faith either, but I think you're getting downvoted because your comment comes off as a pretty strong assertion without giving details on how you got there.
bflesch 1 day ago||
A lot of effort is spent to make the "conversation" feel just like a human-to-human interaction. This is not a naturally occurring phenomenon due to the technology, but rather a feature carefully engineered by those companies in order to get people hooked. Then they have all these tiny nudges like the typing animations or the "thinking..." popups before the next chat message appears.

At some point you might have also noticed the over-use of emojis, the bolted-on jokes, and the tendency to always approve what the user says (even though they have toned that down after backlash). At some point too many people thought they were in a relationship with the chatbot, because it always encouraged and approved them, so they had to hotfix it.

It's a bunch of really dark psychological patterns that are carefully combined by very clever people in order to create the false illusion that the user is experiencing something deeper than an engineered simulation of human interaction.

I think the technology is really useful, but they are obviously not happy with simply replacing a google-like query interface, they want users to fall in love with the product and mentally treat it like a fellow human being - and that's what I think is insincere.

ianm218 1 day ago|||
To get more concrete are you using coding agents like Claude Code/ Codex/ opencode etc? What kind of work are you doing specifically?

If you are doing the kind of median enterprise tech work, these tools are just good enough to do it at a relatively high level or at least heavily augment people doing it.

Examples would be like adding routine CRUD features to APIs/ improving observability/ adding tests or accessibility features to codebases etc.

bflesch 19 hours ago||
I try to explain it better in my longer sibling comment. I'm not using any coding agents. Their engineers can't be bothered to design their own webapp properly so I don't trust their binaries.

For me both Claude and ChatGPT are query-response services and replacements for google. They help with error messages, single-file MVPs, and software design problems such as comparison of different modules.

In my experience everything that goes beyond 200 lines creates issues down the line, so I try to keep interactions really short. Of course they can convincingly add CRUD functionality or tests, but one needs to double check their correctness, and if the subtle bugs are finally spotted then one needs to fix them anyways.

It's good for a first draft but I wouldn't use agents on a codebase I actually care about.

Unfortunately the billion-dollar funding forces the AI startups to make a return, and they are finding it in a vulnerable cohort of people who respond positively to a simulated human interaction, which is why they are focusing so much on it.

The query-response knowledge interface was the moat of google, and nowadays it can be 80% replaced with a local GPU and an open model. They know it, which is why they try to hook people on the simulated human interaction aspect of their interfaces through chatbots and voice chat.

pixl97 1 day ago||||
>A lot of effort is spent to make the "conversation" feel just like a human-to-human interaction.

Well, in humans we call this an education and it takes quite a long time to get one.

bflesch 19 hours ago||
Not a good comparison. Education is the part where they train on all digital content they can get their hands on, no matter the copyrights.

You get your education, you can replace google as a query-response interface to all digital content.

But then they use system prompts to simulate a fake persona and a user interface such as female voices or chat conversation in order to suggest that one is interacting with a real human being. This is clearly aimed at exploiting vulnerable cohorts of people, because the knowledge base part of this innovative technology is already solved.

Like casinos and social media companies, they know the profit is in the "whales" who can be psychologically manipulated to spend their time and money against their own interests.

pixl97 7 hours ago||
This is oddly conspiracy theory oriented.

How would you program an LLM so it gives useful information to people with the least amount of people bitching about it?

At the end of the day the LLM does not have a native persona. It has countless numbers of them. It can act like an autistic man, a flirty woman, a kid from some country you've never heard of. Bringing forth an agreeable persona from the myriad is a bad thing?

fwipsy 1 day ago|||
POV OpenAI, early 2021. You have a pretty good next-token predictor called GPT-3. You noticed sometimes it can do useful things if you write out the start of a task and let it predict the next step. However, sometimes it's very difficult to frame a task that way. So instead you train it to predict the answer based on a question or instructions. Oh wait, it didn't get it right the first time... better let users iterate too. Now you have a conversation.

Things like loading indicators are basic good UI dating back to the 90s.

A/B testing and generally following user preferences might still push towards the dynamic you're describing, as it did with gpt-4o. xAI and a few other companies like Replika also intentionally created "companion"/porn AIs. But in general, natural language was previously exclusive to humans. It's completely natural that the first technology capable of it would therefore be perceived as more human. It's worth trying to resist this tendency, but it doesn't require evil intent on the part of the creators.

bflesch 20 hours ago||
The "conversation" interface is exactly the same workflow as one used with google search: user states a query, page loads, user adapts query because they are not happy with the result, until they end up with a suitable result which makes them close the tab.

So they have made this amazing query-response system which is far superior to google due to the summarization of query results from the global web and the auto-translation to present them in the user's native language. This is the type of raw query-response capability which many software engineers are trying to use in their agentic coding sessions.

However, after achieving such innovation, the AI startups consciously choose to apply social media KPIs to their query-response startup, which incentivizes all the dark patterns we have seen in their user interface. They notice that a certain subset of users can be tricked into believing that the startup's query-response interface has human-like qualities such as a name and persona.

This user cohort shows amazing metrics in terms of time spent on app, so they adapt their user interface and their system prompts accordingly. The AI startup doesn't have to care if the reason for humans accepting the illusion of a simulated human interaction is due to social circumstances (lack of emotional intimacy) or an underlying psychological vulnerability that the startup is actively exploiting.

The AI startup only cares if their "simulated human interaction" product receives negative attention from normal people who are not part of the vulnerable cohort, e.g. the suicides or the parasocial romantic relationships with the chatbots.

It is exactly the same as in the gambling industry: There is a certain subset of users called "whales" who are the cash cows for casinos, but if you look at the actual humans who are labeled with this term one can see pathological gamblers, most of whom are ruining their lives and families. Casinos do everything to prevent people from jumping from their roofs after they lost all their money.

If AI startups can use simulated human interactions to make vulnerable people act against their own interests in the same way as casinos and social media companies do, it will allow them to make shitloads of money.

But if you're actually a clever person then be honest with yourself and others about what you are working on, and why these human-like features are really added to the user interfaces of OpenAI or Anthropic or the other AI startups.

So this is my framing of the situation.

I don't think this kind of problem can be overlooked by the insiders, and we might see some internal rifts along these lines: Will our AI startup simulate a human interaction in order to exploit our vulnerable peers, or will our AI startup focus on delivering the best response to our user's queries?

Because now we have local models, which - assuming one has suitable hardware - provide 80% of the utility in terms of a query-response knowledge base.

As we are currently seeing, the AI startups with billion-dollar funding have very big economic incentives to focus on the "simulated human interaction" part of the equation, because their investors need returns.

The biggest potential strategic blunder I see is at Google: if Google actually changes their excellent query-response user interface to a chat conversation which simulates a human interaction with persona, name, and voice, then they knowingly pivot to the same social media KPI driven business that OpenAI and Anthropic are struggling with.

fwipsy 2 hours ago||
Feels like OpenAI and Anthropic are both more interested in B2B. OpenAI I'm more suspicious of after GPT-4o. I mostly use Claude and I haven't noticed anything that feels like an intentional dark pattern. It's constantly reminding me that it's a chatbot.
AnimalMuppet 1 day ago||||
Heh. In the Schlock Mercenary universe, "SI" means "synthetic intelligence", which is a level below real AI (which means what we would call AGI). And, as it says (in https://www.schlockmercenary.com/2003-07-21), SI translates to "kinda stupid".
ToucanLoucan 1 day ago||
All for a product that has yet to make a single honest dollar in profit for anyone who isn't nvidia.

When this goes we might well see a recession. Not that anyone responsible will be worse off, of course.

tempodox 1 day ago|||
The perpetrators all have their golden parachutes. The taxpayers will foot the bill.
expedition32 1 day ago||
The US is trillions in debt. We live in the age of magic- nobody foots the bill.
scottyah 1 day ago|||
Why on earth would you expect any of them to take profit so early in the game?
ToucanLoucan 1 day ago||
Silly me, expecting a company worth a trillion dollars to make... some money. Any money. A single profitable product.
nl 1 day ago|||
Anthropic is running at an operating profit this quarter: https://www.wsj.com/tech/ai/mind-blowing-growth-is-about-to-...

(It's actually probably more profitable than the projections calculated here, because they were expecting to be running Fable but can't, and Opus costs less to run)

aspenmartin 1 day ago|||
Well the good thing is you’ve done the homework to definitively demonstrate that this is remotely true. These confident claims of this all being some sort of unprofitable Ponzi scheme, not understanding the concept of a growth phase which a multitude of highly successful tech companies have already demonstrated work while simultaneously commenting on a site with YCombinator in the url are just getting amusing now.

Of course this is a profitable technology, and it doesn’t matter if any of the labs are profitable today or not. Running at a loss is a perfectly rational strategy.
