Posted by fugu2 3 days ago
I've been running CC with Qwen3-Coder-30B (FP8) and I find it just as fast, but not nearly as clever.
This is with my regular $20/month ChatGPT subscription and my $200 a year (company reimbursed) Claude subscription.
Right now OpenAI is giving away fairly generous free credits to get people to try the macOS Codex client. And... it's quite good! Especially for free.
I've cancelled my Anthropic subscription...
https://docs.z.ai/devpack/tool/claude
https://www.cerebras.ai/blog/introducing-cerebras-code
or I guess one of the hosted GPU providers
If you're basically a homelabber and want an excuse to run quantized models on your own device, go for it, but don't lie and mutter under your own tinfoil hat that it's a realistic replacement.
And they do? That's what the API is.
To me, the subscription always seemed clearly advertised for client usage, not general API usage. I don't know why people are surprised after hacking the auth out of the client. (Note that in their own clients they can control prompting patterns for caching etc., so it can be cheaper for them to serve.)
The API is for using the model directly with your own tools. It can be for dev, or experiments, or anything else.
Subscriptions are for using the apps: Claude and Claude Code. That's what it has always said when you sign up.
LLMs are a hyper-competitive market at the moment, and we have a wealth of options, so if Anthropic is overpricing their API they'll likely be hurting themselves.