Posted by rarisma 12 hours ago
I was a happy Max user.
Here there be monsters, and we don't have any real way of evaluating risk; and the leverage provided by tools already available affords systemic and even existential risk in a way no one—least of all an industry committed to shareholder value—has had to navigate, let alone with a million backseat drivers each with their own substack and brand to build.
Does "SORRY" fix the deception these models use on the sly?
Does "SORRY" not silently downgrade you to a shittier model without notification?
Does "SORRY" refund your tokens or money?
Im guessing NO to all of those. Standard corporate sorry of "We're sorry youre offended and stupid and gullible".
They just showed that they CAN do this right in front of you. Local open weight models are a necessity.
But also, it isn’t the only huge mistake Anthropic has made in the last 48 hours. Having a sneaky data retention policy, while also giving companies no way to block Fable, is a massive problem. And it is ridiculous that Anthropic has so little respect for its customers. OpenAI should take advantage of this.
They are clear about the reasons for guardrails: prevent their models from doing harm in dual-use contexts including CBRN or by accelerating research in authoritarian-backed AI labs.
What is the critique against that? It seems pretty reasonable to me. You want AI-accelerated biological or radiological experiments running in your neighbors backyard? You want PRC-backed labs to continue to steal Anthropic's models via distillation?
Mitigating the harms of dual-use tech is notoriously difficult and fraught with trade offs. What I would want to see is cautious rollout and quick response, which is EXACTLY what they're doing.
Instead, this thread is full of bad-faith arguments about Anthropic being dishonest, making a "useless" model, or "the power is going to their heads." You can't read Anthropic's System Cards and come away with any of these impressions. Quite the opposite, in fact. They are honest to a fault, acknowledging problems they discovered even when it hurts them.
If your harmless request was downgraded to Opus, you're billed for Opus. They were 100% clear about that. I'd much rather have a Mythos-class model that falls back to Opus 10% of the time than be capped to Opus 100% of the time. If that doesn't work for you, then make a suggestion for something better!
If you are a white-hat security engineer hitting guardrails, I don't think you have standing to complain. I really don't. Their Glasswing program actually got banks and the industrial sector to take action to fix security vulnerabilities. Do you realize how special that is? A huge portion of the economy runs on vulnerable code and has for decades, despite security experts testifying to Congress, begging business leaders, pleading for intervention-- with no results. But suddenly they're all enrolled in a program that will find *and fix* vulnerabilities! White-hat security people should be rejoicing. Instead some of them are throwing rocks. Unbelievable. Shameful.
Meanwhile, society is screaming at the AI labs to be more conscientious about potential harms of AI. Legislatures are passing laws limiting data center construction. There are protests. And you, the HN community, the vanguard of our profession, have the temerity to demand "NO GUARDRAILS!" "HOW DARE YOU TRY TO PROTECT DEMOCRACY!" "MY SOFTWARE PROJECT IS MORE IMPORTANT THAN KEEPING NUKES AWAY FROM THE BAD GUYS!"
Go ahead HN, downvote me. It'd be an honor.
"Distillation involves training less capable models on more advanced ones’ output, and can be used illicitly to acquire powerful capabilities cheaply. The AI startup accused China’s DeepSeek, MiniMax, and Moonshot of generating 'over 16 million exchanges with Claude through approximately 24,000 fraudulent accounts,'"
https://www.semafor.com/article/02/24/2026/anthropic-accuses...
After reading their posts and watching interviews with Dario it's abundantly clear that they view Chinese-lab distillation of US frontier models as a threat to US national security. You can argue with them about whether that is true, but not whether distillation is real.
What accounts for the difference between your attitude that distillation is no big deal, "common practice," yet Anthropic sees as it as a huge threat?
Why not just tell people, "To defend our ability to be competitive in our industry, we ask that you do not use Claude or any of our models to independently perform research on large language models or any of its related architectures or technologies. In order to prevent this violation of the Terms of Service, we have trained Claude Fable to deny any requests or prompts which involve frontier AI research."