Posted by simonw 5/27/2026
More specialized products will consume tokens, but their builders will be incentivized to optimize token use and switch models as costs and capabilities change. And if search engines become more AI-capable, and Google is clearly striving for this, then they may have pressure from two sides that could squeeze the number of use cases for AI chat. AI coding isn't going anywhere, and neither is the need for AI in general, but I wonder if the products will have to evolve significantly to maintain the current levels of PMF. And then there's the question of profitability...
The impact of AI in other fields seems to be muted.
Software development has the huge advantage that mistakes and hallucinations are very easy to spot: the software works or it doesn't.
Spotting errors in a research report or legal brief is a whole lot harder!
But... non-software professionals spend a huge amount of their time on tasks that can be safely automated - reformatting documents, extracting numbers from PDFs, all kinds of flavors of data entry.
Learning how to use a tool like Claude Cowork can make a big dent in those.
Do we not care about code quality, maintainability, performance, extensibility, or understandability anymore? Honest question, not a gotcha, it's just that previously, getting software to pass all the tests was a small part of what we would consider "working" or perhaps "good" software. Maybe that's different now with LLMs, idk. Maybe we need automated checks for these things as well, like not compiling until the code quality is good enough to let the agent finish its loop.
Yes, we should care. I've been writing a whole book about that: https://simonwillison.net/guides/agentic-engineering-pattern...
I suspect that once the technology has been tamed and the hardware and software have been commoditized, the impact will be much less dramatic than we expect and we will realize the importance of a shared vision, experience, taste, intuition and discernment in building good products.
It is only true for USD. For example, if you pay in euros, this is actually more expensive. Kind of makes no sense, because it translates to $1 = €1.
It is quite trivial to switch from one model to another. Likewise, in a few years we'll have affordable laptops to run today's frontier models.
What's their plan to let us keep subscribing?
How about letting you maintain a vibe-coded repo with access only to the context that led to it?
How many tokens is that, input/output-wise?
(a) I'm curious if you feel like you got $2000 worth of value out of them in the last month?
(b) I'm also curious if you would have gotten similar quality out of a slightly lower-cost provider of an open-weight model (e.g. Kimi K2.6 and DeepSeek v4 Pro), and what the spend would have been for that.
I myself have managed to spend not quite $4 on OpenRouter and have felt it was very worth it; I just have much smaller, or more targeted requests I guess. (Lately, adding features to a static site generator in Python, or setting up log forwarding via a docker compose file)
Claude:
Input tokens: 52,545,485
Output tokens: 5,767,253
Cache create tokens: 5,112,029
Cache read tokens: 1,475,069,465
Total tokens: 1,538,494,232
Total cost: $1,199.79
OpenAI Codex:
Input tokens: 52,598,013
Output tokens: 4,681,867
Reasoning output: 2,091,063
Cached input tokens: 1,153,844,864
Total tokens: 1,211,124,744
Total cost: $980.37
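For anyone who wants to sanity-check the two summaries above, here is a rough sketch that re-adds the token counts and derives a blended price per million tokens. The numbers are copied from the post. The "Claude" label for the first summary is an assumption taken from the follow-up comment, and the dictionary keys and the `summarize` helper are made up for illustration. The Codex total only adds up if the "Reasoning output" line is not counted separately, so the sketch leaves it out of the sum.

```python
# Figures copied from the two usage summaries in the post.
USAGE = {
    # Label assumed: the follow-up comment attributes this summary to Claude.
    "Claude (assumed label)": {
        "input": 52_545_485,
        "output": 5_767_253,
        "cache_create": 5_112_029,
        "cache_read": 1_475_069_465,
        "reported_total": 1_538_494_232,
        "cost_usd": 1199.79,
    },
    "OpenAI Codex": {
        "input": 52_598_013,
        # Assumption: the separate "Reasoning output" figure (2,091,063)
        # is not an extra term, since the reported total matches without it.
        "output": 4_681_867,
        "cache_read": 1_153_844_864,
        "reported_total": 1_211_124_744,
        "cost_usd": 980.37,
    },
}

COUNT_KEYS = ("input", "output", "cache_create", "cache_read")


def summarize(entry):
    """Re-add the token counts and derive a few ratios for one summary."""
    total = sum(entry.get(key, 0) for key in COUNT_KEYS)
    return {
        "total": total,
        "matches_reported": total == entry["reported_total"],
        "cache_read_share": entry["cache_read"] / total,
        "usd_per_million": entry["cost_usd"] / (total / 1_000_000),
    }


for name, entry in USAGE.items():
    s = summarize(entry)
    print(
        f"{name}: total={s['total']:,} "
        f"matches_reported={s['matches_reported']} "
        f"cache_read_share={s['cache_read_share']:.1%} "
        f"blended=${s['usd_per_million']:.2f}/M tokens"
    )
```

Both totals check out, cache reads make up roughly 95% of the tokens in each case, and the blended rate works out to somewhere around $0.80 per million tokens for both.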
I'm confident I got value out of OpenAI - I've been mainly on Codex for the last few weeks. Not so sure I got that value from Claude, just because I've been using it a lot less and somehow the price came to about the same as OpenAI.
Given the code I've been able to build in the past month I genuinely do think I got value for the API price version, and (don't tell OpenAI or Anthropic) I think I'd have paid full price.
I've not spent nearly enough time with GLM-5.1 and co to compare, but I do know that the prompts I'm using with the agents are not prompts I would have expected to work just three months ago.
When I account for the amount of time it saved me there's no question $2,000 was worth it.
Personally, I've probably spent $60 or so on OpenRouter in the last month or so and got a working project out of it that it would probably have taken me a fortnight to knock together (which is inevitably an under-estimate because it covered things I'd have to learn but K2.5/6 already knew). There's an orders-of-magnitude gap there.
I'm building a product right now with some AI coding (despite my negative sentiment about AI in general they are useful). I am both the product person and the engineer, and I'm pretty decent at using it, so according to the hype I should be seeing like a 10x speedup. I am not seeing that. It's definitely faster, but there are also days where I'm stuck cleaning up things after going too fast for too long, or periods where I need to put the software in front of people to get real feedback, or even periods where I just need to use it extensively myself to find the pain points and bugs. I just don't see this "running circles" once you get past an MVP and you actually need to build something secure and not embarrassingly broken.
If not, lower-priced Chinese offerings will be better as they're cheaper per token - giving you more attempts to offset the variance.
My feeling on the former is no... I believe they tried really hard but they've settled on pure marketing now to attempt to fight off the Chinese with perceived superiority in quality.
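The "more attempts to offset the variance" point can be made concrete with a toy calculation. This is only a sketch: the success rates and per-attempt prices below are invented for illustration, and it assumes attempts are independent and that you can cheaply tell a good result from a bad one (e.g. by running tests). Under those assumptions, the chance that at least one of k attempts succeeds is 1 - (1 - p)^k.

```python
def attempts_within_budget(budget_cents: int, cost_per_attempt_cents: int) -> int:
    """How many whole attempts a fixed budget buys (integer cents avoid float rounding)."""
    return budget_cents // cost_per_attempt_cents


def p_at_least_one(p_success: float, attempts: int) -> float:
    """Probability that at least one of `attempts` independent tries succeeds."""
    return 1.0 - (1.0 - p_success) ** attempts


# Hypothetical numbers, not measurements of any real model.
BUDGET_CENTS = 100
pricey = {"p_success": 0.8, "cost_cents": 100}
cheap = {"p_success": 0.5, "cost_cents": 20}

for name, model in (("pricey", pricey), ("cheap", cheap)):
    k = attempts_within_budget(BUDGET_CENTS, model["cost_cents"])
    print(f"{name}: {k} attempt(s), P(at least one success) = "
          f"{p_at_least_one(model['p_success'], k):.3f}")
```

With these made-up inputs the cheaper model comes out ahead for the same spend, but the result flips easily if its per-attempt success rate is much lower, or if checking each attempt costs real time.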
Firstly, if the user is asking for things where AI can link to products or services to buy, there's a very good relevancy, much higher than in other types of ads.
Secondly, since the AI often takes time to compute answers to users' questions, they could be shown ads while waiting. People could perhaps be less annoyed by this than by some other commercials since they know the break has to be there anyway.
(The first idea is something I came up with when asking Claude to compare some products, or asking for help with lawn care. The second idea was a colleague's.)
I do agree with the author that these companies seem much stronger financially recently though.