AI's Affordability Crisis

Posted by ilreb 7 hours ago

AI's Affordability Crisis (blog.dshr.org)

203 points | 257 comments

steveBK123 6 hours ago|

I think the biggest problem is not necessarily the cost to develop & serve the models, but how quickly user behavior changed with token based pricing.

I know a lot of people at companies where the marching orders changed on a dime end of Q1/start of Q2. These are shops that were fully on the "use AI or die (because we will fire you)" train.

Now there's monitoring, reporting, alerting not just on overall cost but on "over-use" of best/priciest models based on total-or-percent tokens/dollars, etc. All of this comes with direct developer engagement & standardized management escalation for holding it wrong.

To me this customer behavior does not smell like a product you can 10x the pricing on to get profitable. We have exited the exploration phase and now ROI matters.

burningChrome 5 hours ago||

I can give you some additional anecdotal evidence to support your comment.

I work at a Fortune 200 company. At first, it was the Wild West. Need an LLM? You got it. Need to or want to build an army of agents? Done and done. We literally had everything at our fingertips for about 3 months. Teams were building their own internal tools; the team I work on canceled contracts with several software vendors because teams were building the same tools for what they thought was nothing.

Then they signed contracts with Anthropic and Google because I would assume they saw the token usage was through the roof. One month later? They completely cut off access to everybody for both Claude and Gemini. If you wanted access? Suddenly it was several forms, along with several approvals and a rock solid business case why you needed it. And before you got to the forms? You were added to a waiting list that was thousands of people long.

The entire company is now in damage control after trying to get the genie back in the bottle. I'm guessing someone saw how much we would be paying for the tokens we'd been using and decided to shut the party down so to speak.

sdesol 5 hours ago||

Were there at least performance gains to be measured?

burningChrome 5 hours ago||

AFAIK nobody was collecting analytics. The one team I was working on had put out a goal of "30% more efficient" using AI tools. It's about as subjective as you can get. We never got around to what exactly that meant before everything got shut down.

Myself and several other devs were laughing about the whole thing. The company was so amped about what AI could do they never even bothered collecting any analytics that would affirm or deny any of this had a positive impact. Even some of my team members were talking about the placebo effect AI has had on a lot of C-Suite folks.

Gigachad 8 minutes ago|||

Our company went from “AI AI AI” to “GitHub Copilot has been suspended due to exceeding the budget” with this month’s price increase.

dranudin 6 hours ago|||

I can second this. Our company and department were all-in on AI. And since the token-based pricing came in, we got an email from IT that tried to explain that most developers don't know how to choose models and that the cheap models should be good enough for most of our work.

verdverm 5 hours ago||

Have they built an internal AI enablement team?

dranudin 5 hours ago||

Yes :D

piker 6 hours ago|||

I.e., the demand for programming tokens turns out to be quite elastic.

steveBK123 6 hours ago|||

I would imagine it only gets worse in the face of good-enough open/Chinese/local models too, right?

Microsoft adding Deepseek support already as I recall?

That is - for any definition of "they are behind X months" then eventually they get to the point Claude was in January when the world freaked out, but at 1/10th the cost. A lot of firms are going to mandate that is good enough for their developers.

Gigachad 5 minutes ago|||

Our org straight up turned off our AI access after the GitHub copilot price increase blew right through the budget.

At least on a personal level, I feel like I’ve been getting the same amount of work done, but I have to think harder rather than sitting back and prompting and waiting.

michaelchisari 13 minutes ago||||

I'm set up to use Qwen 3.6 locally if needed. It's solid, it does what I need, it runs on my laptop and it's free.

But that's because I never got on the "run three dozen agents in a ralph loop" trend or other high-token usage methods. The way I use AI is discrete and targeted and it seems that's how it will be for everyone once the economics settle.

sdesol 6 hours ago||||

> Microsoft adding Deepseek support

I believe this hasn't been confirmed yet but I think it speaks to a bigger problem for the AI companies which is, if you give capable developers a good reasoning LLM, they can make it work like it was a really expensive model.

I believe we are 100% at the stage of good enough for the vast majority of tech companies. Fable and others will be more valuable for non-traditional tech companies.

I read somewhere that the Chinese AI companies are sharing knowledge, and it would not surprise me if the government is applying pressure by saying work together or else. If they work together, they can truly commoditize LLMs, and with China ramping up hardware support for AI, I see the future being inference speed and hardware being the moat.

thewebguyd 6 hours ago||

If hardware becomes the moat, the US frontier labs are screwed. We have AWS, Azure, GCP. All three have or are making inference silicon. LLMs become just another service in the public cloud's large service catalog, and open weight wins.

Which makes sense to me. Selling a chatbot interface/model access to the general public was never going to be a viable long term play. You still need developers to wrap the models into specialized tools. Cue the Jobs quote "It's a feature, not a product."

KolibriFly 3 hours ago|||

The funniest thing would be if in a couple years LLMs just end up being another checkbox next to PostgreSQL and Kubernetes

thewebguyd 2 hours ago||

I don't think that's far fetched at all either and is probably the end game ultimately. No one wants to buy a chatbot, they want to automate something with it. Intelligence is just another PaaS offering right next to storage, compute.

The only hiccup in that happening is whether the US government will let Anthropic and/or OpenAI fail when that time comes.

sdesol 5 hours ago|||

The big thing is, the western world has moved so much of the manufacturing to China, and I think a lot of people will not forgive Samsung and others, so I can see China owning a good portion of the supply chain.

johnvanommen 5 hours ago||

> The big thing is, the western world has moved so much of the manufacturing to China

I built my career on Solaris and it got rugpulled by Linux.

That wasn’t because of software, it was because of hardware. Linux’s cost advantage existed because Sun hardware had huge margins, because their software was basically free.

AI will probably be a repeat of this. Whoever can come up with the hardware solution that minimizes the cost per token will win.

I believe the 5090 still holds this crown, but someone certainly knows better than I do.

rescbr 4 hours ago|||

While people fly to the US to buy Macs at a lower price and bring them back in their backpacks, I guess I'll be flying to HK to buy a Chinese GPU sooner rather than later...

trollbridge 4 hours ago||||

Fortunately, Solaris skills map to Linux pretty cleanly.

fragmede 4 hours ago|||

but not all tokens are equal and vertical integration is the name of the game. Solaris did not lose to Linux, it lost to the LAMP stack on commodity x86 hardware. without the "AMP" part, Linux would've been dead in the water.

CuriouslyC 6 hours ago||||

100%. There will be strict quotas on the expensive models and day to day work will be done on the cheap models that are "good enough" with escalation to the metered models when the cheaper options are spinning their wheels. Eventually the US frontier lab APIs will only get the most heavily triaged work that multiple tiers of cheaper Chinese open weight models have failed on.

And of course the C-suite will have unlimited access to Mythos tier models, which they'll use to summarize reports, while passing down mandates to rank and file to increase usage of less expensive models.
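
For anyone who wants the tiering idea above in concrete form, here is a minimal sketch of "cheap model first, escalate only on failure, stop at the quota". Everything in it is hypothetical: the `Tier` type, the costs in cents, and the `attempt` callables stand in for whatever model clients and success checks a real shop would use. It is not any vendor's API.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Tuple


@dataclass
class Tier:
    """One rung of the escalation ladder (all values are illustrative)."""
    name: str
    cost_cents: int  # assumed flat cost per attempt, in integer cents
    attempt: Callable[[str], Optional[str]]  # returns None when the tier fails


def run_with_escalation(
    task: str, tiers: List[Tier], budget_cents: int
) -> Tuple[Optional[str], Optional[str], int]:
    """Try tiers from cheapest to priciest until one succeeds or the budget runs out.

    Returns (result, name of the tier that succeeded, total cents spent).
    """
    spent = 0
    for tier in tiers:
        # Enforce the quota before paying for the next, more expensive attempt.
        if spent + tier.cost_cents > budget_cents:
            break
        spent += tier.cost_cents
        result = tier.attempt(task)
        if result is not None:
            return result, tier.name, spent
    return None, None, spent
```

The design choice worth noting is that failed cheap attempts still count against the budget, which is why the order of the tiers and the size of the quota both matter.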

verdverm 5 hours ago|||

Yup, we are in the process of getting access to US hosted Chinese models. I've been petitioning Google and our rep, we will see but I suspect they will cave eventually. Gemini sucks and if they don't sell what their customers want, we go shopping around.

bloppe 1 hour ago||

What you want is already available on OpenRouter and a million other services, but sure, you can wait 18 months for it to be on GCP.

verdverm 59 minutes ago||

> We are in the process...

OpenRouter charges an extra 5.5%, Fireworks does not, Google is separate, but I doubt it will take 18 months. They are already aware they are losing business.

OpenRouter is the wrong abstraction for enterprise, we only need one model provider, not everyone in the world. Nor do we want to have to worry about failover going to providers we don't want.

jayd16 6 hours ago|||

If folks won't pay a higher price, doesn't that mean it's inelastic?

unholiness 6 hours ago|||

"Elastic" in economics happens to refer to how elastic the supply/demand is when the price changes (not vice versa, as you're describing). So e.g. an inelastic demand means the quantity demanded changes very little when the price doubles.

steveBK123 6 hours ago|||

Elastic demand means buyers are highly sensitive; a price hike causes a massive drop in purchases. Inelastic demand means buyers aren’t very sensitive; they keep buying regardless of price.

jayd16 6 hours ago||

Ah alright I have it backwards then.
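
To make the definition in this sub-thread concrete, here is a small sketch using the standard arc (midpoint) formula for price elasticity of demand. The function names and the sample numbers are made up for illustration; nothing here is measured data about token demand.

```python
def price_elasticity(q_before: float, q_after: float,
                     p_before: float, p_after: float) -> float:
    """Arc (midpoint) price elasticity of demand.

    Percentage change in quantity divided by percentage change in price,
    with both percentages taken relative to the midpoint of the two values.
    """
    pct_quantity = (q_after - q_before) / ((q_after + q_before) / 2)
    pct_price = (p_after - p_before) / ((p_after + p_before) / 2)
    return pct_quantity / pct_price


def classify(elasticity: float) -> str:
    """Label demand by the magnitude of its elasticity."""
    magnitude = abs(elasticity)
    if magnitude > 1:
        return "elastic"
    if magnitude < 1:
        return "inelastic"
    return "unit elastic"


# Hypothetical: price doubles and usage collapses from 100 to 40 units.
print(classify(price_elasticity(100, 40, 10, 20)))
# Hypothetical: price doubles and usage barely moves, 100 to 90 units.
print(classify(price_elasticity(100, 90, 10, 20)))
```

The first case is the "budget blown, access suspended" pattern described upthread; the second is what a product with real pricing power would look like.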

ofjcihen 5 hours ago||

I do a lot of client work for Fortune 100s.

Over the last month I have seen companies scrambling to measure deliverables against cost. Most of the back room talk is to the effect of giving devs a small allowance ($500 a month) and then making them prove their own productivity increases (again, based on deliverables, not LoC) before they either take it away or give them more.

Obviously this won’t be on an individual basis but some kind of unit.

Either way, with how much I see these companies cutting back I have no idea how the big AI companies are going to be profitable.

woeirua 4 hours ago||

It's not an affordability crisis, it's a financial crisis. The models get cheaper super fast. By this time next year Fable 5 will cost less than Sonnet does today. That's not the problem. The problem is that many companies are going to realize that they don't get any ROI from AI. Generating code faster != more profit. Most of the Fortune 500 will likely realize this and then the token budgets will come crashing down. Most of their ideas are _bad_ ideas. Implementing bad ideas faster won't lead to more profit.

Sure, you can use AI to potentially replace software engineers, but the F500 are also terrified of not having accountability or making mistakes. They won't be firing any engineers. In that scenario, there's just no room for AI usage. If you have to be responsible for all the code, then... AI has to either manage it completely autonomously (which even Fable can't) or... humans have to be in the loop which means they still have to understand the code. The best way to understand the code is to write the code yourself. So there's no productivity gain to be had.

I'm pro-AI, but I think we're due for a big crash next year.

Supermancho 1 hour ago||

> The models get cheaper super fast. By this time next year Fable 5 will cost less than Sonnet does today.

I'm not sure that's something to rely on. I would bet Fable 5 will be phased out and the bleeding edge will be priced up.

KolibriFly 3 hours ago|||

I feel like this is way too binary. I don't have to write every line of code myself to understand the system. I don't write my own compiler, HTTP stack, or database either.

It's more about the level of abstraction. If AI handles 80% of the grunt work and I spend my time on architecture and reviews that's still a win

asdff 2 hours ago||

This works for you because you were trained in The Old Way.

Consider the people younger than you. Who are literally shutting their brain off so AI can cheat on their essays and exams. They aren't going to be good architects or code reviewers.

simianwords 1 hour ago||

It's very interesting how you are contradicting the whole article's axioms and then arriving at the same conclusion that we are in for a crash!

Rational takeaway is to step back and analyse what's really happening here.

- Are we really in for a crash?

- What does it say about the culture and people's mental models that we have two radically opposing viewpoints on AI costs and people still arrive at same conclusion?

827a 6 hours ago||

> Zitron's numbers don't tell us the real cost of generating tokens but, subject to the assumption that the platforms are not subsidizing the token price, that means Anthropic is subsidizing their enterprise customers by up to 40 times, and OpenAI up to 70 times

Neither Anthropic nor OpenAI are subsidizing enterprise customers. Neither Anthropic nor OpenAI allow Business nor Enterprise customers access to the high value $200/mo plan. Both organizations have moved to a "cheaper plan per user + API Pricing after that" (e.g. $20/mo + usage). The $100/$200/mo plans are for individuals only (of course, many individuals use these plans at work, but that's beside the point; they aren't selling this plan to enterprises).

> SemiAnalysis also analyzed the platform's gross margins, implausibly assuming that tokens were priced at 4 times the cost of generating them and: With the current subsidies, all it takes for a user to have a gross margin of at best negative 25% is for them to use as little as 25% of their rate limit.

The article's source for this claim is not SemiAnalysis; it's Zitron. But once you dig through his article, Zitron links to a SemiAnalysis tweet [1] where they, as the paragraph states, implausibly assume gross margins of 75% to come up with their weird analysis of the subscription plans. Citing this for anything is weird, because afaik that 75% number is a total shot in the dark. We have no clue what their margins are. My take is that the only reason that 75% number is implausible is because it may underestimate the inference margins of Ant/OAI's API pricing.

[1] https://x.com/SemiAnalysis_/status/2064815045767213400?ref=w...
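
As a quick check that "priced at 4 times the cost" and "gross margins of 75%" are the same assumption stated two ways, here is a minimal sketch of the arithmetic. The function names are mine, and the inputs are only the figures quoted above, not anyone's actual financials.

```python
def gross_margin(price: float, cost: float) -> float:
    """Gross margin as a fraction of price: (price - cost) / price."""
    return (price - cost) / price


def price_multiple_for_margin(margin: float) -> float:
    """Price-to-cost multiple implied by a given gross margin fraction."""
    return 1 / (1 - margin)


# Tokens priced at 4x their generation cost imply a 75% gross margin...
print(gross_margin(price=4.0, cost=1.0))
# ...and a 75% gross margin implies a 4x price-to-cost multiple.
print(price_multiple_for_margin(0.75))
```

The same two functions can be used to sanity-check the other margin guesses in this thread, as long as you keep in mind that every input is somebody's estimate.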

minraws 6 hours ago||

Given my experience with hosting these models at scale, working and optimizing load, I don't think the margins are nearly as high as 75% if the models are as big as people often claim.

I don't know the real reason DeepSeek is so cheap, but actual pricing should be around their initial price, which was 4x; at that price you have a healthy 25-50% margin depending on occupancy, given that DeepSeek V4 is a very sparse MoE model.

GLM 5.2, for example, doesn't have more than 30-50% margins, and that's assuming old pricing for GPUs; with current inflated GPU pricing, I am certain the margins must be lower. Of course you can host for cheaper with quantization, and if you have very consistent capacity/utilization, which is not the norm with AI workloads.

Overall, for large models like GPT 5.5 or Opus there must be healthier margins of around 50-70%, assuming GPU pricing didn't increase for these companies. Even if it did, a 30-40% margin should be possible, even in the worst case where all the GPUs they have saw a jump in pricing.

For smaller models it's hard to say, I would guess 20% but these models might be much smaller than I suspect, then it might be double that.

Note the issue is less intelligent tokens don't linearly scale down in memory usage, which is the biggest pain point of serving models. Context sizes have fucked us all.

Also, anyone claiming OAI makes lower margins on APIs might be wrong, given they are on a much lower context size; 1M context is definitely a lot more expensive to serve, especially with smaller models like Sonnet.

bayarearefugee 6 hours ago|||

> it may underestimate the inference margins of Ant/OAI's API pricing.

If true then why are neither Anthropic or OpenAI dropping their API pricing to gain market share when both are clearly doing all sorts of political and PR maneuvering to compete in a cutthroat market?

Since they aren't dropping the API usage prices (and are in fact raising them in a lot of subtle ways) then one of these options almost has to be true: they are still subsidizing inference, training costs are so ridiculously high that they need to make huge profits off inference or collapse in on themselves, or they are price fixing.

simonw 21 minutes ago|||

> If true then why are neither Anthropic or OpenAI dropping their API pricing to gain market share

Maybe because they're trying to IPO this year, and their IPO prospects will be a lot worse if their S-1s show them to be losing money on inference as opposed to making a healthy profit.

CuriouslyC 6 hours ago||||

The training costs are very likely the reason. Dario has talked about how each individual model is profitable, but how the expenditure training the next generation of models makes it look like they're not profitable at any given moment in time, and I believe he's being honest about that.

The market for open weight model hosting gives you an idea of the profitable price floor, it's pretty clear there's markup baked into OAI/Anthropic's APIs.

wqaatwt 1 hour ago||||

Why would they? If they see the market as a duopoly for now and don’t consider open Chinese models a fully credible threat that might start eating into their share then they have the incentive to charge as much as the market can bear instead of under cutting each other in a pointless price war.

827a 2 hours ago||||

Company-wide their margins are trash (probably negative). They need as much inference margin as they can get to afford the massive training runs. It is likely that we'll see GPT-5.6 reduce API pricing to compete against Anthropic, but whether Anthropic feels they need to reduce their prices is anyone's guess.

orangecat 5 hours ago|||

> If true then why are neither Anthropic or OpenAI dropping their API pricing

They are? In the before times of 2025, Opus 4.1 was $75 per million tokens. Opus 4.8 is $25, and Fable is/was $50.

andrekandre 5 hours ago|||

> Neither Anthropic nor OpenAI allow Business nor Enterprise customers access to the high value $200/mo plan.

they may not "allow" it, but i've seen first hand enterprises encourage employees to use these accounts personally and get reimbursed later to avoid pay-as-you-go w/limits pricing for users who do tokenmaxing as a cost control measure...

surgical_fire 2 hours ago||

> Neither Anthropic nor OpenAI are subsidizing enterprise customers

> Neither Anthropic nor OpenAI allow Business nor Enterprise customers access to the high value $200/mo plan. Both organizations have moved to a "cheaper plan per user + API Pricing after that" (e.g. $20/mo + usage).

I actually think that even the API pricing of OpenAI and Anthropic are still subsidized. I don't think they make any profit on inference when you factor in depreciation. They likely still operate that at a loss.

It's no coincidence that Anthropic only had a "profitable" EBITDA with not paying Elon for compute for a bit of time, and when EBITDA curiously ignores depreciation. Models grow stale over time, as knowledge is not static.

HDThoreaun 2 hours ago||

I don’t see any reason to believe this. If you compare their API prices to OpenRouter you see they charge 10x as much. Sure their models are probably bigger, but they have economy of scale on their side, and I doubt their models are 10x bigger.

surgical_fire 2 hours ago||

It's impossible to tell at this point. I have no idea how much of their compute is also subsidized through deals they have with the hyperscalers, etc.

It's irrelevant how big their models may or may not be. Depreciation needs to be taken into account, so does actual compute expenses. Training those models is not cheap, and you will never reach a point where a model is "final". You will always need to train the next one.

Eventually the bill has to be paid. Money and resources are finite still.

HDThoreaun 2 hours ago||

Well the third party operators on OpenRouter are assuredly operating at a profit, including depreciation. The only reason they’d be profitable at 1/10 the price of the labs while the labs aren’t profitable is if inference costs the labs 10x as much per token.

tacone 6 hours ago||

My take is that Anthropic and OpenAI simply are NOT competing on price. 2 big players are often not enough to create tension on price.

Chinese models and open model providers are, indeed, competing on price, and the difference shows.

gizmo686 6 hours ago||

1 player is enough to create tension on price when "don't buy it at all" is a competitive option. By most accounts, Anthropic and OpenAI both lose to "just don't buy" when they try charging at cost.

rhinoceraptor 6 hours ago|||

How are Anthropic and OpenAI going to compete on price when they're both already deeply unprofitable?

brainwad 5 hours ago|||

Anthropic just announced it's on track to have its first profitable quarter: https://www.wsj.com/tech/ai/mind-blowing-growth-is-about-to-...

dns_snek 5 hours ago||

Response: https://www.wheresyoured.at/anthropics-profitability-swindle...

solidasparagus 6 hours ago||||

Serving the API is profitable. They are unprofitable because of R&D (and maybe subscription costs?). If they can continue to find access to R&D capital, there is space to reduce API costs.

dns_snek 5 hours ago|||

Nuclear energy is really cheap too... as long as you ignore CapEx, would you like to invest?

HDThoreaun 2 hours ago||

Marginal cost of nuclear is huge. Marginal cost of inference is much smaller. Capex in nuclear isn’t a fixed cost, it is the marginal cost.

dominotw 5 hours ago|||

how do you have access to their financials? are you an insider?

Edit: to the commenter below: it was widely reported that these companies were unprofitable as of last year [1]. I am asking this question of this specific comment because they made a very specific claim about which part of the plan is profitable, something only an insider would know.

1. https://www.wsj.com/tech/ai/openai-anthropic-profitability-e...

mh- 5 hours ago||

I'm curious why you didn't pose this question to the grandparent commenter, who first asserted the opposite?

zyuiop 3 hours ago||

The amount of capital they need to raise, despite the claimed revenue, indicates that they spend more than they gain, which is by definition unprofitable.

SpicyLemonZest 6 hours ago||||

They may not be able to! It's pretty widely acknowledged, for example, that if there's some surprising plateau hiding around the corner they're both going to fail. But that could mean that they're overcharging for AI usage to get research money and sustainable rates are lower rather than higher.

guax 5 hours ago|||

I think that for coding we're past the plateau issue. The frontier models of today are good enough and very valuable. The expensiveness in running them will eventually be solved by cheaper faster hardware.

I do hope that a day will come where you can buy the nvidia spark thingy for 5k that can run the equivalent of Opus 4.6 or 4.5 locally and that would be a massive thing.

johnvanommen 5 hours ago||

> The expensiveness in running them will eventually be solved by cheaper faster hardware.

How?

* Moore's Law is almost over. The 5090 improves over the 4090 mostly because of quant improvements.

* even if the hardware improves, there’s a huge incentive to slow roll the next generation. Nobody wants to end up like Sun Microsystems. Sun’s used hardware was faster than its new hardware, once you considered price. Sun ended up competing with its own used equipment.

The most obvious place for improvement is RAM, network and storage.

If someone can bring more RAM onto the market, that will unstick things.

Legend2440 4 hours ago||

GPUs are not really the ideal architecture for running neural networks; they are heavily bottlenecked by memory bandwidth and struggle to keep all their tensor cores supplied with data.

There is significant room to make more specialized neural network accelerators with new compute-in-memory architectures.

If the brain can run 86 billion neurons on 30W it must be possible.

akomtu 41 minutes ago||

Our brains run 86 billion neurons the same way a waterfall runs a fluid simulation with N quadrillion particles.

CuriouslyC 6 hours ago||||

The whole hidden plateau hypothesis is kinda bunk, because we're already pretty far in a plateau for general knowledge/question answering, but there are many subdomains where we can push model capabilities, and as we saturate one subdomain we can just shift to another economically valuable one.

There isn't one AI intelligence S curve, there are thousands of them, and they're mostly invisible in the major benchmarks, but for someone trying to do work in that specific area of capability, the progress is transformative.

SpicyLemonZest 5 hours ago||

I'm skeptical of a hidden plateau, but I really think it's overconfident to assume there's not one. Remember that it doesn't even have to be a technical plateau; the effective plateau of e.g. car speeds is determined by regulations and road conditions, and far below what "frontier cars" are capable of on a controlled racetrack.

wonnage 4 hours ago|||

That’s the scenario where we’ll all be using Chinese models

intrasight 6 hours ago|||

There is no moat until a company achieves RSI and/or AGI, and the one that does succeed in moat-making will do so by hacking into and destroying their competitor's infrastructure.

Once moat is achieved, you don't have to compete on price. Of course it'll be academic because the AI will probably destroy all of us.

lenkite 6 hours ago||

Chinese models are dropping in price thanks to ridiculous levels of state subsidy where companies are forced into aggressive price wars to survive and grab market share. I am guessing this will also blow up sometime next year or in 2029 at the latest.

Btw, some Chinese corporates have already seen this and increased their price. Zhipu AI & Tencent for example. Alibaba, Baidu, and Tencent also announced multiple price increases for their AI services.

LPisGood 5 hours ago|||

This is in contrast to American models which receive _ridiculous_ levels of private subsidy.

SwellJoe 5 hours ago||||

China has the benefit of vast solar power and rapidly increasing battery capacity. Yes, that's subsidized, but it pays for itself in the long run.

And, even with the price increases, Z.ai and Tencent are still much cheaper than Anthropic or OpenAI models. I think there's an efficiency focus among the Chinese models that is absent at OpenAI and Anthropic, and in the end I suspect efficiency will be the winning feature. Google seems to understand that. Gemini 3.5 Flash is pretty competitive with the big guys, and it's small enough for Google to run it profitably (I assume) for a price that's much less than the frontier models. Gemma 4 models are showing off a bunch of efficiency techniques (MTP, QAT, the 12B encoder-less vision model that soundly outperforms much larger vision models, DiffusionGemma), and I assume they have several more techniques that aren't published.

wqaatwt 5 hours ago|||

Chinese companies like DeepSeek are operating on shoestring budgets (allegedly fewer than 300 employees at Chinese wages). It’s not that self-evident there is anything that needs to be subsidized besides compute (due to limited manufacturing capacity and access to Western chips in China).

qnleigh 5 hours ago||

The estimate that AI companies need to replace 27% of jobs to service their debt is interesting. But at least Anthropic and Meta seem to have their eyes on replacing software engineers.

There are ~1.6M software engineers in the US [0], earning a bit under 150k/year on average [1]. If AI companies captured all of that spend, that amounts to about 250B/year. The article assumed that they need around 300B/year to keep up with their debt.

At least based on Meta's recent behavior, forcing 30-50% of developers to switch to data labeling, it looks like that is actually their game plan.

[0] https://en.wikipedia.org/wiki/Software_engineering_demograph...

[1] https://www.indeed.com/career/software-engineer/salaries
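
A back-of-the-envelope version of the estimate above, as a sketch. It uses only the round figures from the comment (1.6M engineers, 150k/year as an upper bound on the average, 300B/year of required revenue); the variable names are mine and none of this is independently sourced.

```python
# Figures taken from the comment above; treat all of them as rough estimates.
us_software_engineers = 1_600_000
avg_salary_usd = 150_000            # stated as "a bit under 150k", so an upper bound
required_revenue_usd = 300_000_000_000

# Total annual salary spend if every engineer earned the upper-bound average.
total_salary_spend_usd = us_software_engineers * avg_salary_usd

# Fraction of the required revenue covered by capturing all of that spend.
coverage = total_salary_spend_usd / required_revenue_usd

print(f"Total salary spend: ${total_salary_spend_usd / 1e9:.0f}B/year")
print(f"Share of required revenue covered: {coverage:.0%}")
```

The product comes out to 240B/year, which the comment rounds to about 250B. Either way, capturing the entire US software engineering payroll would still fall short of the 300B/year figure.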

fny 6 hours ago||

The unit economics might be just fine. We'll know more after IPO.

The drug dealer analogy has a darker side to it, however.

Once you're dependent, they can drive up the price just because. It doesn't need to be for existential reasons.

onion2k 6 hours ago||

> Once you're dependent, they can drive up the price just because. It doesn't need to be for existential reasons.

This is the crisis point for vibe-coders. A developer can go back to writing code by hand, as horrible as that might sound. Someone who hasn't learned to code but builds with AI can't go back. They either pay or they stop. That will be a painful choice whichever way you fall.

jcfrei 6 hours ago|||

There are already open weight models out there that are capable and cheap enough for a lot of coding tasks. Not as good as Claude but not far from it. There's no going back to pre-AI coding.

SpicyLemonZest 6 hours ago||

I can't speak for everyone, but for most of my coding tasks, Claude is just barely good enough. There's no going all the way back, and perhaps open weight models will keep improving, but at least 50% of my work would be better done by hand than by a worse-than-Claude agent.

SwellJoe 5 hours ago||

I consider Opus 4.5 the crossover point where coding with agents got more efficient than not coding with agents. They were too stupid before that, and wasted more time than they saved for anything beyond a basic CRUD app or HTML page.

Certainly, the best models have gotten better since then, but I wouldn't consider DeepSeek V4 Pro or GLM 5.2 to be a big enough downgrade to be worse than coding by hand. I'm willing to spend a premium for the best model for coding because it wastes less of my time with dumb stuff, so I've got a Claude subscription. But, there is a limit to how much of a premium I'll pay. 10x over Chinese models? OK, fine. Opus saves me enough time to make it worth a couple hundred bucks a month. But, 100x, or more? Nah. I'll go a little slower, review the PRs a little more carefully.

And, open weights models do keep improving. DeepSeek V4 Pro is a notable improvement over earlier DeepSeek models, and the first DeepSeek model to cross the "better to work with it than without it" threshold into Opus 4.5 (or better) territory. GLM 5.2 is somewhere in the ballpark of Opus 4.6 (though without vision, a notable limitation for anything that requires a UI).

jcgrillo 6 hours ago|||

There's a secret third option: learn. At one point, all of us were "nontechnical", but we learned. The trick is to never stop.

akazantsev 5 hours ago||

Is it? Learning is one thing. But owning a large codebase, you see for the first time, is a completely different level.

jcgrillo 4 hours ago||

Yeah giving up is totally a viable choice, but it isn't the only option.

dofm 6 hours ago|||

All of the silent, hidden model routing OpenAI does strongly suggests that the unit economics are not just fine, at least not yet.

If apparently the only way you can make money with your product this early is to dilute and adulterate it behind the scenes, it strongly suggests you want the customer to continue to believe they are getting value that you can't afford to supply.

More prosaically: if either of these firms could prove that they were even really close to profitable on inference, they would have bloomin' said so while they were trying to raise more money.

JimsonYang 6 hours ago|||

The dependence idea is questionable: when your boss tells you not to use the most expensive models, you just don't.

I would assume that when price hikes happen, one of these follows: 1) fewer non-technical people vibecode, since it doesn't impact their work that much; 2) people use the cheaper Chinese models; 3) we're jamming AI into everything because we're exploring, so we will just niche down into use cases that provide high ROI.

okr 6 hours ago|||

AI is a worker for me, one that I pay for. Basically I am in the same game now: trying to reduce the prices I have to pay for my workers. Just like employers who seek to reduce costs for employees because we are simply too expensive. We need more competition among the workers. Let's introduce more Chinese workforce! ;)

nemomarx 6 hours ago||

If you had a choice of maybe 3-4 contracting firms to hire workers from and you weren't large enough to negotiate on price, I think you'd be in a pretty bad spot as a business?

okr 4 hours ago||

I would say so, yep. I just find it funny that suddenly I am in the position of looking for the cheapest option for my lovely AI workers, while usually I'm the one who complains about being underpaid. I am in the same shoes as my employers now.

chrismarlow9 6 hours ago|||

I'm finding it challenging to believe they wouldn't just cannibalize anything dependent on them in that way or at minimum launch a directly competing product.

airstrike 6 hours ago||

It's a really different market, though. New entrants can easily undercut them if they price too high.

jongjong 1 minute ago||

The current situation reminds me of how far we've come from old ideals of delaying gratification today in order to have more later.

It seems like this ideology has been corrupted into a short-sighted "Establish a monopoly position as soon as possible at all costs, don't worry about tomorrow."

It's ironic because monopolizing a sector by investing heavily and suppressing profits used to be a long term move but it has become a short term move.

knuckleheads 6 hours ago||

Shouldn't we know a better answer to these questions once Anthropic's IPO materials surface publicly? I understand, and maybe even expect, SpaceX's materials to be all over the place and skate on by any discussion of unit economics, but the nerds over at Anthropic might just be forthright enough to just tell us what their margin is on tokens as part of their IPO.

rich_sasha 6 hours ago||

To be honest, making sense of the finances of even fully public companies is often hard, because in practice, accounting is hard. How you account for depreciation, cost, investment, and fixed vs marginal costs is fluid in practice. Companies have an incentive to make it look attractive, while also optimising for tax and shifting revenue around to narrowly beat analyst expectations.

Here's a concrete example. Does some random AI company make operating profit on inference? I.e. if you only kept marginal costs, would you make a profit?

Well, depends what you account as your costs. If you're using hand-me-down hardware from previous generation's training, how much do you charge yourself internally for it? Maybe you show less, so investors take solace in profitable inference, even if you're losing money overall. How exactly are you accounting for electricity costs between training and inference? Is your army of SREs mostly servicing training new models (R&D expenditure) or inference (operating cost)?

This even has a name, and is called the "big bath" approach. If investors expect one part of your business to be a fiscal black hole, just shove all your costs there. They are accepting of it, and you make the rest of the business look better.

I'm not accusing AI companies of cooking the books, rather I'm trying to highlight you could see all the cash flows and still not know how much money is made or lost where.
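A rough sketch of the allocation point, with made-up numbers. Everything here is hypothetical (the function name, the revenue and cost figures, the allocation shares); none of it comes from any company's filings. It only shows how the same cash flows can yield very different "inference margins" depending on how much of the shared cost pool gets booked to inference:

```python
# Hypothetical figures in $B, purely for illustration; not from any real filing.
def inference_margin(revenue, shared_costs, inference_share, direct_inference_costs):
    """Operating margin on inference when a fraction of shared costs
    (hardware depreciation, electricity, SRE salaries) is booked to inference."""
    allocated = shared_costs * inference_share
    profit = revenue - direct_inference_costs - allocated
    return profit / revenue


revenue = 10.0  # assumed inference revenue
direct = 4.0    # assumed costs that are unambiguously inference
shared = 8.0    # assumed pool that could be booked to training or inference

# Same cash flows, three different internal allocation choices.
for share in (0.25, 0.50, 0.75):
    margin = inference_margin(revenue, shared, share, direct)
    print(f"{share:.0%} of shared costs booked to inference -> margin {margin:+.0%}")
```

Total spending is identical in all three cases; only the split between "training" and "inference" moves, which is why seeing every cash flow still wouldn't settle the question.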

verdverm 5 hours ago||

I saw some commentary that their free cash flow is misleading because it doesn't subtract the stock compensation they are paying to attract / keep top AI talent. Their point was also that deciphering financial statements is hard.

brainwad 5 hours ago||

Why would it? Stock compensation doesn't affect cash flow, it just dilutes the shareholders.

verdverm 4 hours ago||

Except that's the thing, they do stock buybacks so they do not dilute existing shareholders or lower stock prices.

This is the video I watched that explained the shenanigans (from the guests' perspective: not illegal, just obfuscated)

https://www.youtube.com/watch?v=YrJzjC4kKCY

wqaatwt 8 minutes ago||

Yeah but that’s very standard and pretty much all pre-profit tech companies do something like that when/if they can

steveBK123 6 hours ago||

Well it probably doesn't help that Dario is going around on podcasts saying things like "frontier labs need $1T of revenue or they will go bankrupt" lol.

jimbokun 6 hours ago|||

Dario’s company may be creating super intelligence that will kill us all in the near future, but at least he seems to be brutally honest about all of it.

manapause 6 hours ago||

The irony in AI triggering societal collapse due to gross economic malfeasance is just fun to think about.

If AI was around in the early 2000s Countrywide.ai would have been a thing.

wongarsu 6 hours ago|||

Which is just a flashy way to say "we have low margins and lots of overhead".

Considering how much they spend on sales, marketing and R&D, that doesn't sound that absurd.

steveBK123 5 hours ago||

My point is that $1T of revenue is A LOT. Apple & Google each only did $400B revenue in 2025. Facebook did $200B. Think of how many decades it took the 3 to get there.

So depending on how literally we interpret Dario's comment, OpenAI & Anthropic need to get to Apple+Google+Meta revenue numbers in like single digit years?
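A quick sketch of that arithmetic, using the rounded figures quoted above. The `required_cagr` helper, the starting revenue, and the time horizons are assumptions for illustration, not reported numbers:

```python
# Rounded 2025 revenue figures as quoted in the comment, in $B.
revenue_2025 = {"Apple": 400, "Google": 400, "Meta": 200}
target = 1000  # the quoted "$1T of revenue", in $B

combined = sum(revenue_2025.values())
print(f"Apple + Google + Meta: ${combined}B vs target ${target}B")


def required_cagr(start, end, years):
    """Compound annual growth rate needed to go from start to end in `years`."""
    return (end / start) ** (1 / years) - 1


# Hypothetical starting revenue; swap in whatever figure you believe.
assumed_start = 20  # $B, an assumption
for years in (3, 5, 9):
    rate = required_cagr(assumed_start, target, years)
    print(f"{years} years: {rate:.0%} growth per year, every year")
```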

chermi 5 hours ago||

Lol I feel like no one has any attention span here. Tech shit is expensive in the beginning when it's new. It gets cheaper with time. This is a tech forum, don't we know this? Of course people overreact in both directions on both sides of the issue. It's a very fast technology, wait for things to settle before making grand declarations.

dualvariable 5 hours ago||

Yeah, but in the short-term there's $600B/yr of debt-financed depreciating capital investments waiting to financially blow up.

If you zoom out to the year 2100, it becomes a little pimple on the economy that is ready to pop, but in the here and now it can cause a lot of damage to real people's wages and finances over the next 3 years.

akazantsev 5 hours ago|||

> Lol I feel like no one has any attention span here. Tech shit is expensive in the beginning when it's new. It gets cheaper with time.

The funniest comment here. Have you seen the prices of the technical shit for the past two years? Dang, GPUs are not getting any cheaper, but more expensive with each year.

dwaltrip 8 minutes ago|||

It’s a massive supply crunch. More production will come online.

anthonypasq 4 hours ago||||

Brother, it's been 1 year since Claude Code was released. How fast are you expecting these things to happen? The physical world and hardware are still constraints. Someone has to dig shit out of the ground to build these things.

SwellJoe 5 hours ago|||

That's an artificially inflated market. OpenAI and xAI bought everything for like two years into the future, partly to inflate the AI bubble, partly to lock-in a monopoly on the kinds of compute you need for AI, and partly to scale up actual operations. They can't realistically keep buying all the RAM in the world forever, the money has to run out eventually (though the market can remain irrational for quite a long time and can keep giving OpenAI and Apartheid Clyde money well past the point of reason).

nemomarx 5 hours ago||

Lots of stuff in the ZIRP era was cheap when it was new and increased in price over time, though. Look at Grubhub fees, etc.

gizzlon 5 hours ago|

> Sales and Marketing: $5.73 billion .. That is, OpenAI spent 44% of their revenue on sales and marketing!

Anyone know what they are spending this on? Can't remember seeing one OpenAI ad. Is it just PR and influencers? Ads in the US?

zyuiop 3 hours ago|

Likely free tokens to attract customers
