Posted by jnord 13 hours ago

No, it doesn't cost Anthropic $5k per Claude Code user (martinalderson.com)
247 points | 180 comments | page 2
z3ugma 9 hours ago|
This is such a well-written essay. Every line revealed the answer to the immediate question I had just thought of.
lovecg 8 hours ago|
I can’t get past all the LLM-isms. Do people really not care about AI-slopifying their writing? It’s like learning about bad kerning, you see it everywhere.
crakhamster01 5 hours ago|||
I had a similar reaction to OP for a different post a few weeks back - I think some analysis on the health economy. Initially as I was reading I thought - "Wow, I've never read a financial article written so clearly". Everything in layman's terms. But as I continued to read, I began to notice the LLM-isms. Oversimplified concepts, "the honest truth", "like X for Y", etc.

Maybe the common factor here is not having deep/sufficient knowledge on the topic being discussed? For the article I mentioned, I feel like I was less focused on the strength of the writing and more on just understanding the content.

LLMs are very capable at simplifying concepts and meeting the reader at their level. Personally, I subscribe to the philosophy of - "if you couldn't be bothered to write it, I shouldn't bother to read it".

ajkjk 4 hours ago||
Alternate theory... a few months into the LLMism phenomenon, people are starting to copy the LLM writing style without realizing it :(
amonith 2 hours ago||
This happens to non-native English speakers a lot (like me). My style of writing is heavily influenced by everything I read. And since I also do research using LLMs, I'll probably sound more and more like an AI as well, just by reading its responses constantly.

I just don't know what natural writing is supposed to be anymore. It's not in the books, it's disappearing from the internet, so what's left? Some old blogs, for now maybe.

weird-eye-issue 7 hours ago||||
I think you're just hallucinating because this does not come across as an AI article
lovecg 7 hours ago|||
I see quite a few:

“what X actually is”

“the X reality check”

Overuse of “real” and “genuine”:

> The real story is actually in the article. … And the real issue for Cursor … They have real "brand awareness", and they are genuinely better than the cheaper open weights models - for now at least. It's a real conundrum for them.

> … - these are genuinely massive expenses that dwarf inference costs.

This style just screams “Claude” to me.

hansvm 7 hours ago||||
It was almost certainly at least heavily edited with one. Ignoring the content, every single thing about the structure and style screams LLM.
lelanthran 6 hours ago||||
> I think you're just hallucinating because this does not come across as an AI article

It has enough tells in the correct frequency for me to consider it more than 50% generated.

NetOpWibby 7 hours ago|||
Name checks out
raincole 4 hours ago||||
It's really unfortunate that we call well-structured writing 'LLM-isms' now.
Erem 7 hours ago||||
I don’t see the usual tells in this essay
152334H 7 hours ago||||
People care, when they can tell.

Popular content is popular because it is above the threshold for average detection.

In a better world, platforms would empower defenders, by granting skilled human noticers flagging priority, and by adopting basic classifiers like Pangram.

Unfortunately, mainstream platforms have thus far not demonstrated strong interest in banning AI slop. This site in particular has actually taken moderation actions to unflag AI slop, on certain occasions...

rhubarbtree 6 hours ago|||
It is certainly very obvious a lot of the time. I wonder whether, if we revisited the automated slop detection problem, we'd be more successful now… it feels like there are a lot more tells and models have become more idiosyncratic.
weird-eye-issue 5 hours ago||
Tons of companies do this already. It's not like this is a problem that nobody is constantly revisiting...
jeff_antseed 6 hours ago||
the openrouter comparison is interesting because it shows what happens when you have actual supply-side competition. multiple providers, different quantizations, price competition. the spread between cheapest and priciest for the same model can be 3-5x.

anthropic doesn't have that. single provider, single pricing decision. whether or not $5k is accurate, the more interesting question is what happens to inference pricing when the supply side is genuinely open. we're seeing hints of it with openrouter but it's still intermediated

not saying this solves anthropic's cost problem, just that the "what does inference actually cost" question gets a lot more interesting when providers are competing directly

vmykyt 2 hours ago||
I have a very naive question:

People in the comments assume that Anthropic's model is 10 times bigger than the Chinese models, so the calculated cost is 10 times higher.

But from a Big O perspective, only a few algorithms give you O(N). Most highly optimized things are O(N log N).

So what is the big O for any open model, for a single request?

fancyfredbot 2 hours ago||
It's a good question. Costs will be lumpy. Inference servers will have a preferred batch size. Once you have a server you can scale number of users up to that batch size for relatively low cost. Then you need to add another server (or rack) for another large cost.

However, beyond that, I think it's fair to say the cost is roughly linear in the number of users.

There may be some aspects which are not quite linear when you see multiple users submitting similar queries... But I don't think this would be significant.
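
The "lumpy but roughly linear" point above can be made concrete with a toy step-function cost model. The batch size and per-server cost below are invented for illustration, not Anthropic's actual figures:

```python
BATCH = 256           # assumed users one inference server can batch together
SERVER_COST = 10_000  # assumed monthly cost per server (made-up figure)

def monthly_cost(users: int) -> int:
    """Total cost jumps at every multiple of BATCH, but trends linearly."""
    servers = -(-users // BATCH)  # ceiling division: servers needed
    return servers * SERVER_COST

for n in (100, 256, 257, 10_000):
    # per-user cost converges toward SERVER_COST / BATCH (~$39) as n grows
    print(n, monthly_cost(n), round(monthly_cost(n) / n, 2))
```

The staircase is visible at n = 257, where one extra user doubles the bill, but at 10,000 users the per-user cost is already close to the linear asymptote.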

rat9988 2 hours ago||
O(N log N) can be approximated as O(N) for most realistic use cases.

As for LLMs, there is probably some constant cost added once the model can fit on a single GPU, but beyond that it should be almost linear.
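
The "approximately linear" claim is easy to check numerically: the log factor barely moves across realistic scales, so an O(N log N) cost curve in the number of users is close to a straight line.

```python
import math

# N grows a millionfold across this range, but log2(N) only grows ~3x,
# so an O(N log N) cost looks nearly linear over any realistic user count.
for n in (1_000, 1_000_000, 1_000_000_000):
    print(f"{n:>13,}  log2 = {math.log2(n):.1f}")
```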

n_u 7 hours ago||
Good article! Small suggestions:

1. It would be nice to define terms like RSI or at least link to a definition.

2. I found the graph difficult to read. It's a computer font that is made to look hand-drawn and it's a bit low resolution. With some googling I'm guessing the words in parentheses are the clouds the model is running on. You could make that a bit more clear.

ineedaj0b 4 hours ago||
What CC costs internally is not public. How efficient it is, is not public.

…You could take efficiency improvement rates from previous model releases (from x -> y) and assume they have already made similar "improvements" internally. This is likely closer to what their real costs are.

akhrail1996 3 hours ago||
The comparison with Qwen/Kimi by "comparable architecture size" is doing a lot of heavy lifting. Parameter count doesn't tell you much when the models aren't in the same league quality-wise.

I wonder if a better proxy would be comparing by capability level rather than size. The cost to go from "good" to "frontier" is probably exponential, not linear - so estimating Anthropic's real cost from what it takes to serve Qwen 397B seems off.

brianjeong 8 hours ago||
These margins are far greater than the ones Dario has indicated during many of his recent podcast appearances.
skybrian 7 hours ago|
What did he say?
vbezhenar 3 hours ago||
Why does Anthropic charge 10x for the API compared to subscriptions? They're not a monopoly, so one would expect margins to be thinner.
preommr 47 minutes ago||
It's why every integration basically tries to piggyback off of a subscription, and why Anthropic has to continuously play whack-a-mole trying to shut those services down.
hobofan 3 hours ago||
Monopoly isn't the only thing that allows you to charge large margins.

API inference access is naturally a lot more costly to provide compared to Chat UI and Claude Code, as there is a lot more load to handle with less latency. In the products they can just smooth over load curves by handling some of the requests slower (which the majority of users in a background Code session won't even notice).
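
The load-smoothing argument can be sketched: if delay-tolerant background work (e.g. Claude Code sessions) can be pushed into quiet hours, you provision for the interactive peak plus the *average* background load, not the worst combined hour. All traffic numbers below are invented:

```python
interactive = [10, 50, 90, 50, 10, 5]   # latency-sensitive requests per hour
background = [10, 20, 100, 60, 30, 20]  # delay-tolerant requests per hour

# API-style: every request is served in the hour it arrives,
# so capacity must cover the worst combined hour.
peak_immediate = max(i + b for i, b in zip(interactive, background))

# Subscription-style: spread background work evenly across the day.
spread = sum(background) / len(background)
peak_deferred = max(i + spread for i in interactive)

print(peak_immediate, peak_deferred)  # deferring lowers the required peak
```

With these made-up numbers the required capacity drops from 190 to 130, which is the kind of headroom that can fund a cheaper subscription tier.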

functionmouse 13 hours ago||
Was anyone under the impression that it does? Serious question. I've never heard that, personally.
versteegen 7 hours ago||
Ed Zitron made that claim (in particular here: [1]). In the same article he admits he's not a programmer and had to ask someone else to try out Claude Code and ccusage for him. He doesn't have any understanding of how LLMs or caching work. But he's prominent because he's received leaked financial details for Anthropic and OpenAI, e.g. [2]

[1] https://www.wheresyoured.at/anthropic-is-bleeding-out/ [2] https://www.wheresyoured.at/costs/

sunaurus 3 hours ago||
Maybe I'm misreading it, but I don't see him saying it's just the cost of *inference* alone (which is the strawman that the article in the OP is arguing against). He says:

> this company is wilfully burning 200% to 3000% of each Pro or Max customer that interacts with Claude Code

There is of course this meme that "Anthropic would be profitable today if they stopped training new models and only focused on inference", but people on HN are smart enough to understand that this is not realistic due to model drift, and also due to competition from other models. So training is forever a part of the cost of doing business, until we have some fundamental changes in the underlying technology.

I can only interpret Ed Zitron as saying "the cost of doing business is 200% to 3000% of the price users are paying for their subscriptions", which sounds extremely plausible to me.

simianwords 6 hours ago|||
You would be surprised because there are lots of posters here who think that the cost is so enormous that this whole industry is unviable.
crazygringo 8 hours ago|||
I mean, the very first paragraph of TFA is describing who is under that impression. Literally the first sentence:

> My LinkedIn and Twitter feeds are full of screenshots from the recent Forbes article on Cursor claiming that Anthropic's $200/month Claude Code Max plan can consume $5,000 in compute.

fulafel 7 hours ago||
That's claiming that, worst case, a subscriber _can_ use that much. It's possible that's wrong too, but in any case a lot of services are built on the assumption that the average user doesn't max out the plan.

So the article's title is obviously sensationalized.
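
The "average user doesn't max out the plan" point is just an expected-value calculation. With an invented usage distribution (every fraction and dollar figure below is hypothetical), the $5k tail barely moves the average:

```python
# Hypothetical distribution of retail-equivalent compute consumed per month
# by $200/mo subscribers. All numbers are invented for illustration.
usage = {
    0.75: 50,     # most users barely touch their quota
    0.20: 300,
    0.04: 1_000,
    0.01: 5_000,  # the headline worst-case power user
}
avg_cost = sum(frac * dollars for frac, dollars in usage.items())
print(avg_cost)  # average retail-equivalent cost per subscriber
```

Under these assumptions the average lands below the $200 price even though 1% of users consume $5k, which is exactly how gyms and unlimited-data plans work.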

vidarh 5 hours ago||
I have no problem believing that a Claude Max plan can consume the equivalent of $5000 worth of retail Opus use. But one interesting thing you'll see if you e.g. have Claude write agents for you is that it's pretty aggressive about setting agents to use Sonnet or even Haiku, so not only will most people not exhaust their plans, but a lot of those who do will do so in part using the cheaper models. When you then factor in Anthropic's reported margins, and their ability to prioritise traffic (e.g. I'd assume that if their capacity is maxed out they'd throttle subscribers in favour of pay-per-token customers? Maybe not, but it's what I'd do), I'd expect the real cost to them of a maxed-out plan to be much lower.

Also, while Opus certainly is a lot better than even the best Chinese models, when I max out my Claude plan, I make do with Kimi 2.5. Factoring in the re-runs of changes because of the lower quality, I'd spend maybe 2x as much per unit of work if I were to pay token prices for all my monthly use w/Kimi.

I'd still prefer Claude if the price comes down to 1x, as it's less hassle w/the harder changes, but their lead is effectively less than a year.
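
That re-run math amounts to a quality-adjusted cost comparison. The 10x token-price ratio is the article's figure; the attempts-per-change numbers below are invented:

```python
# Relative token price and (invented) attempts needed per completed change.
opus_price, opus_attempts = 10.0, 1.0
kimi_price, kimi_attempts = 1.0, 3.0

opus_cost = opus_price * opus_attempts  # cost per unit of completed work
kimi_cost = kimi_price * kimi_attempts

print(opus_cost / kimi_cost)  # raw 10x price gap shrinks to ~3.3x effective
```

The point of the sketch: a headline 10x price gap between models overstates the effective gap once re-runs on the cheaper model are counted.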

dimgl 9 hours ago||
Twitter.
aurareturn 7 hours ago|
By the way, one of the charts in the article shows that Opus 4.6 is 10x costlier than Kimi K2.5.

I thought there was no moat in AI? Even being 10x costlier, Anthropic still doesn't have enough compute to meet demand.

Those "AI has no moat" opinions are going to be so wrong so soon.

spiderice 7 hours ago||
Claude Code Max obviously doesn't cost 10x more than Kimi. The article even confirms that you can get $5k worth of compute for $200 with Claude Code Max.

So no, Claude would not be getting NEARLY as much usage as it's currently getting if it weren't for the $100/$200 monthly subscription. You're comparing Kimi to the price that most people aren't paying.
