Posted by jnord 13 hours ago
Maybe the common factor here is not having deep/sufficient knowledge on the topic being discussed? For the article I mentioned, I feel like I was less focused on the strength of the writing and more on just understanding the content.
LLMs are very capable of simplifying concepts and meeting the reader at their level. Personally, I subscribe to the philosophy of "if you couldn't be bothered to write it, I shouldn't bother to read it".
I just don't know what natural writing is supposed to look like anymore. It's not in the books, it's disappearing from the internet; what's left? Some old blogs, for now, maybe.
“what X actually is”
“the X reality check”
Overuse of “real” and “genuine”:
> The real story is actually in the article. … And the real issue for Cursor … They have real "brand awareness", and they are genuinely better than the cheaper open weights models - for now at least. It's a real conundrum for them.
> … - these are genuinely massive expenses that dwarf inference costs.
This style just screams “Claude” to me.
It has enough tells in the correct frequency for me to consider it more than 50% generated.
Popular content is popular because it stays under the average reader's detection threshold.
In a better world, platforms would empower defenders, by granting skilled human noticers flagging priority, and by adopting basic classifiers like Pangram.
Unfortunately, mainstream platforms have thus far not demonstrated strong interest in banning AI slop. This site in particular has, on certain occasions, actually taken moderation actions to unflag AI slop...
anthropic doesn't have that. single provider, single pricing decision. whether or not $5k is accurate, the more interesting question is what happens to inference pricing when the supply side is genuinely open. we're seeing hints of it with OpenRouter but it's still intermediated
not saying this solves anthropic's cost problem, just that the "what does inference actually cost" question gets a lot more interesting when providers are competing directly
People in the comments assume that Anthropic's models are 10 times bigger than the Chinese models, so the calculated cost is 10 times higher.
But from the perspective of Big O notation, only a few algorithms give you O(N). The majority of highly optimized things are O(N log N).
So what is the big O for an open model serving a single request?
However I think it's fair to say the cost is roughly linear in the number of users other than that.
There may be some aspects which are not quite linear when you see multiple users submitting similar queries... But I don't think this would be significant.
As for LLMs, there is probably some constant cost added once the model fits on a single GPU, but beyond that it should be almost linear.
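To make that shape concrete, here's a toy model (all numbers invented for illustration, nothing to do with Anthropic's actual economics): serving cost as a step function of GPU count plus a roughly linear per-user term.

```python
import math

def serving_cost(users, cost_per_gpu=10_000.0,
                 users_per_gpu=500, marginal_cost_per_user=2.0):
    """Toy serving-cost model: a fixed cost per GPU provisioned,
    plus a roughly linear marginal cost per user served."""
    gpus = max(1, math.ceil(users / users_per_gpu))
    return gpus * cost_per_gpu + users * marginal_cost_per_user

# Once past the first GPU, doubling users roughly doubles cost:
print(serving_cost(1_000))   # 22000.0
print(serving_cost(2_000))   # 44000.0
```

The step function captures the "fits on a single GPU" constant; everything after it is linear in users, which is the claim above.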
1. It would be nice to define terms like RSI or at least link to a definition.
2. I found the graph difficult to read. It's a computer font made to look hand-drawn, and it's a bit low resolution. With some googling I'm guessing the words in parentheses are the clouds the model is running on. You could make that a bit clearer.
…You could take efficiency improvement rates from previous model releases (from x -> y) and assume they have already made similar "improvements" internally. This is likely closer to what their real costs are.
I wonder if a better proxy would be comparing by capability level rather than size. The cost to go from "good" to "frontier" is probably exponential, not linear - so estimating Anthropic's real cost from what it takes to serve Qwen 397B seems off.
API inference access is naturally a lot more costly to provide compared to Chat UI and Claude Code, as there is a lot more load to handle with less latency. In the products they can just smooth over load curves by handling some of the requests slower (which the majority of users in a background Code session won't even notice).
[1] https://www.wheresyoured.at/anthropic-is-bleeding-out/ [2] https://www.wheresyoured.at/costs/
> this company is wilfully burning 200% to 3000% of each Pro or Max customer that interacts with Claude Code
There is of course this meme that "Anthropic would be profitable today if they stopped training new models and only focused on inference", but people on HN are smart enough to understand that this is not realistic due to model drift, and also due to competition from other models. So training is forever a part of the cost of doing business, until we have some fundamental changes in the underlying technology.
I can only interpret Ed Zitron as saying "the cost of doing business is 200% to 3000% of the price users are paying for their subscriptions", which sounds extremely plausible to me.
> My LinkedIn and Twitter feeds are full of screenshots from the recent Forbes article on Cursor claiming that Anthropic's $200/month Claude Code Max plan can consume $5,000 in compute.
So the article's title is obviously sensationalized.
Also, while Opus certainly is a lot better than even the best Chinese models, when I max out my Claude plan, I make do with Kimi 2.5. Factoring in the re-runs of changes because of the lower quality, I'd spend maybe 2x as much per unit of work if I were to pay token prices for all my monthly use w/Kimi.
I'd still prefer Claude if the price comes down to 1x, as it's less hassle w/the harder changes, but their lead is effectively less than a year.
I thought there was no moat in AI? Even being 10x costlier, Anthropic still doesn't have enough compute to meet demand.
Those "AI has no moat" opinions are going to be so wrong so soon.
So no, Claude would not be getting NEARLY as much usage as it's currently getting if it weren't for the $100/$200 monthly subscription. You're comparing Kimi to the price that most people aren't paying.