Posted by simonw 5/27/2026
You think this is a fantastic deal only because they use the same kind of trick where they inflate the price and tell you something is supposed to cost $1000 but they have a promo today for $100.
I was there too, and paid for a while. A few weeks ago I tried DeepSeek V4 Pro. I expected it to be shit, but it's actually pretty good.
The deal is I pay ~$1 a day for ~100M tokens of API usage on DSV4-pro. And they're probably not going broke, because in practice >90% of those tokens are cache reads and they're very well optimized for that.
So ballpark same price per parameter as Simon.
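For anyone who wants to redo the arithmetic in the comment above, here is a minimal sketch. The $1/day and 100M tokens/day figures come from the comment; the two per-million-token rates passed to `blended_price_per_mtok` are made-up placeholders for illustration, not any provider's actual price list.

```python
# Back-of-the-envelope token pricing, using the figures quoted in the comment.
DAILY_COST_USD = 1.0      # from the comment: ~$1 per day
DAILY_TOKENS_M = 100.0    # from the comment: ~100M tokens per day


def effective_price_per_mtok(cost_usd: float, tokens_m: float) -> float:
    """Average price actually paid per million tokens."""
    return cost_usd / tokens_m


def blended_price_per_mtok(cache_frac: float,
                           cached_rate: float,
                           uncached_rate: float) -> float:
    """Weighted average price when cache_frac of the tokens are cache reads."""
    return cache_frac * cached_rate + (1.0 - cache_frac) * uncached_rate


# About one cent per million tokens on average.
print(effective_price_per_mtok(DAILY_COST_USD, DAILY_TOKENS_M))

# HYPOTHETICAL rates (USD per million tokens), chosen only to show how a
# 90% cache-read share pulls the blended price toward the cached rate.
print(blended_price_per_mtok(0.90, cached_rate=0.005, uncached_rate=0.05))
```

The point is only that when most tokens are cache reads, the average price sits close to the cache-read rate rather than the headline input rate.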
So many startups trying to automate sales, but somehow the two biggest frontier labs have decided that the best GTM strategy is firmly human-in-the-loop.
the economics simply don't work unless you make six figures, at least not if you just want to give it a go blindly. the providers are also still figuring out what they can get away with charging, and they are getting similar treatment from those below them in the stack.
the caps and limits are not very transparent, and it is quite difficult to know what is "enough". the rates keep changing and the contract is changed way too often to commit for the long term. regardless, the subsidized rates can't be sustainable forever. make hay while the sun shines i suppose.
However, the valuations are still far, far away from actual sanity.
I use GLM-5.1 and occasionally DeepSeek V4.
They are as good as or better than Claude's latest models.
And significantly cheaper. I've converted 3 of my engineer friends as well. All three have dropped the $200/month plans they had with Anthropic.
We've all been a bit shocked at just how good these models are now.
If you "have" tried GLM (I specifically find it shockingly good for code), did you find it's not competitive with Claude, and why?
It's good enough for personal stuff. It doesn't compare to the latest Opus I use at work. You can certainly argue I don't need Opus for work, but there is clearly a difference.
Also, at least with z.ai, GLM-5.1 is s l o w! After using Claude at work, I get really impatient with GLM-5.1 at home. When doing "true" vibe coding (i.e. not really examining the code), Opus is a ton faster (easily 5x).
But yeah, I'm not willing to personally pay for the frontier models. I won't even renew my annual Z.ai plan - it's become too expensive.
Also, and I know you may not want to answer, could you give me an idea of the type of thing you found GLM to be worse at?
I think I've been fairly unbiased in testing a bunch of different development tasks. But I'm curious whether it performs well for some stuff and not others, so it would help if you could share what you feel it's worse at.
Also, are you an experienced developer or less experienced?
When DeepSeek V4 Pro came out, I had been mostly coding with GLM-5.1 on a Z.ai coding plan.
I had a large analysis task on a relatively complex codebase. I decided to try the models out.
GLM-5.1 did acceptably but got a few things wrong (easily corrected) and took quite a while to get there.
Opus 4.6 burnt through the US$10 budget I had given it in about 10-15 min, without ever returning from the first prompt.
DeepSeek V4 returned a full analysis within 2-3 min, and I carried on all the way to implementing the feature I was after. Total cost less than US$1.00.
I now mostly alternate between GLM-5.1 and DeepSeek V4 Flash, with an occasional dip into V4 Pro for more complex analyses.
right now everyone is using the latest and greatest to do dumb stuff like that. that would change fast if companies started caring about costs.
Orgs with more than 150 users aren't on $200/month plans; they are forced into API pricing + $20/month/user.
For individuals and orgs small enough to get to use the subscription plans, that's all well and good until usage limits keep going down, or cost goes up. If you compare the usage you get on $200/month maxed out vs. what that would cost at API pricing, the $200/month plan is an absolute steal. I doubt it will last long.
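To put rough numbers on that comparison, here is a small sketch using only the two figures from the comment: the $200/month flat plan and the $20/month/user seat fee on top of API pricing. The function names and the sample $500 API spend are assumptions for illustration.

```python
# Flat subscription vs. "API pricing + seat fee", per user per month.
FLAT_PLAN_USD = 200.0   # from the comment: $200/month plan
SEAT_FEE_USD = 20.0     # from the comment: $20/month/user on top of API usage


def enterprise_monthly_cost(api_spend_usd: float,
                            seat_fee_usd: float = SEAT_FEE_USD) -> float:
    """Per-user monthly cost under API pricing plus a seat fee."""
    return seat_fee_usd + api_spend_usd


def break_even_api_spend(flat_plan_usd: float = FLAT_PLAN_USD,
                         seat_fee_usd: float = SEAT_FEE_USD) -> float:
    """API spend at which both arrangements cost the same."""
    return flat_plan_usd - seat_fee_usd


# Below this much API usage per month, the seat-fee arrangement is cheaper.
print(break_even_api_spend())

# HYPOTHETICAL heavy user: $500 of API usage in a month.
print(enterprise_monthly_cost(500.0))
```

So a user only comes out ahead on the API arrangement if their metered usage stays under $180 a month, which is exactly what a maxed-out subscriber would blow past.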
On the plus side, I'm happy I'll have a nice hay barn when the local half-built AI data center is abandoned.
Recent conversation here on that topic: https://news.ycombinator.com/item?id=47062534#47063134
> I think this is often a mental excuse for continuing to avoid engaging with this tech, in the hope that it will all go away.
Thanks for that psychological explanation. I was wondering why people were simply ignoring the math that shows that inference at API pricing can be quite profitable, e.g. published here for DeepSeek V3/R1 with 545% profitability: https://github.com/deepseek-ai/open-infra-index/blob/main/20...
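A quick way to read the 545% figure, assuming it means profit divided by serving cost (that interpretation is an assumption here; check the linked write-up for the exact definition):

```python
# Converting a profit-over-cost percentage into more familiar ratios.

def revenue_multiple(profit_over_cost: float) -> float:
    """Revenue as a multiple of cost, given profit / cost."""
    return 1.0 + profit_over_cost


def cost_share_of_revenue(profit_over_cost: float) -> float:
    """Fraction of each revenue dollar that goes to serving cost."""
    return 1.0 / (1.0 + profit_over_cost)


margin = 5.45  # 545% expressed as a ratio
print(revenue_multiple(margin))        # revenue relative to cost
print(cost_share_of_revenue(margin))   # cost as a share of revenue
```

Under that reading, revenue is about 6.45x the serving cost, so serving cost is roughly 15.5% of what is charged.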
But I also think that their API token pricing represents a real margin over the inference costs for serving those tokens.
Both things can be true at once.
But that's the point of the article. Enterprise plans are starting to get API pricing, not the subsidized subscription pricing.
why would enterprises do that if they can just use bedrock or vertex?
Just imagine how funny it will be if it comes out that big labs were doing some fancy maths to count the 2k$/month in their forecasts ...