I think Anthropic and OpenAI have found product-market fit

Posted by simonw 5/27/2026

I think Anthropic and OpenAI have found product-market fit (simonwillison.net)

1094 points | 1245 comments | page 4

jreynar 5/28/2026|

I may be biased because I work on an AI powered enterprise productivity product, but while I agree they have PMF right now, I wonder whether people's use will evolve in ways that undercut the current PMF. Chatting with an assistant is great if there's no product with tailored UI available that also has the AI capabilities. But once there is, I suspect people may switch, or more importantly enterprises may switch because they'll get the benefits of AI without the clunkiness of a chat only interface. We may see another DOS -> GUI-like shift.

More specialized products will consume tokens but their builders will be incented to optimize token use and switch models as costs and capabilities change. And if search engines become more AI capable, and Google is clearly striving for this, then they may have pressure from two sides that could squeeze the number of use cases for AI chat. AI coding isn't going anywhere and nor is the need for AI in general but I wonder if the products will have to evolve significantly to maintain the current levels of PMF. And then there's the question of profitability...

oxedom 5/28/2026|

MCP-Apps could be the solution for tailored UI.

rubiquity 5/27/2026||

I think it's fair to say they had achieved product-market fit when their revenues were growing deep triple digits month over month. What we're seeing now is that perhaps they have achieved profitability or at the least a more sustainable balance sheet.

spprashant 5/27/2026||

So it largely sounds like many more people will be able to write software - and will use AI to do it. Existing software engineers will continue to automate their tasks away like they always did, but perhaps at a faster rate.

The impact of AI in other fields seems to be muted.

simonw 5/27/2026||

I think it is applicable to a much wider range of knowledge work, but it's also harder to apply there.

Software development has the huge advantage that mistakes and hallucinations are very easy to spot: the software works or it doesn't.

Spotting errors in a research report or legal brief is a whole lot harder!

But... non-software professionals spend a huge amount of their time on tasks that can be safely automated - reformatting documents, extracting numbers from PDFs, all kinds of flavors of data entry.

Learning how to use a tool like Claude Cowork can make a big dent in those.

slopinthebag 5/27/2026||

> Software development has the huge advantage that mistakes and hallucinations are very easy to spot: the software works or it doesn't.

Do we not care about code quality, maintainability, performance, extensibility, or understandability anymore? Honest question, not a gotcha, it's just that previously getting software to pass all the tests was a small part of what we would consider "working" or perhaps "good" software. Maybe that's different now with LLMs, idk. Maybe we need automated checks for these things as well, like not compiling until the code quality is good enough to let the agent finish its loop.

simonw 5/27/2026|||

> Do we not care about code quality, maintainability, performance, extensibility, or understandability anymore?

Yes, we should care. I've been writing a whole book about that: https://simonwillison.net/guides/agentic-engineering-pattern...

tracerbulletx 5/28/2026|||

What code quality even means is different now, but also LLMs are capable of producing better quality code at scale in my company's experience. We are in fact able to propagate best practices and structure via the LLM to all of the teams, even when they're working under time pressure.

pianopatrick 5/27/2026||

If the AI can write code for robots the impact in other fields may be pretty large. Seems to me a lot of jobs can be automated with software and robots combined. The limit in the past was writing the software to get the robots to work. But if AI can remove that limit...

pandoro 5/28/2026||

If those companies found PMF we should expect to see the effects of all those tokens burned on the products we all know and use. In my own experience: yes, new features can be delivered much more quickly (by orders of magnitude), but knowing what to build and how to evolve a product with taste and direction has not gotten any faster; so ultimately the user-facing gains are marginal.

I suspect that once the technology has been tamed and the hardware and software have been commoditized, the impact will be much less dramatic than we expect and we will realize the importance of a shared vision, experience, taste, intuition and discernment in building good products.

Szpadel 5/27/2026||

> but as far as I can tell those credit costs are an exact match for the API token costs listed for those models.

It is only true for USD. For example, if you pay in euros, it is actually more expensive. It kind of makes no sense, because it translates to $1 = €1.

harrouet 5/28/2026||

Product-market fit, but what about customer retention?

It is quite trivial to switch from one model to another. Likewise, in a few years we'll have affordable laptops to run today's frontier models.

What's their plan to let us keep subscribing?

simonw 5/28/2026|

Right now the main plan for that appears to be having those enterprise accounts commit for a year at a time.

harrouet 6/1/2026||

A one-year commitment won't save their churn rate without vendor lock-in.

How about letting you maintain a vibe-coded repo only with access to the context that led to it?

NortySpock 5/27/2026||

"[would have spent] $1,199 with Anthropic, $980 with OpenAI"

How many tokens is that, input/output-wise?

(a) I'm curious if you feel like you got $2000 worth of value out of them in the last month?

(b) I'm also curious if you would have gotten similar quality out of a slightly lower-cost provider of an open-weight model? (e.g. Kimi K2.6 and DeepSeek v4 Pro) and what the spend would have been for that.

I myself have managed to spend not quite $4 on OpenRouter and have felt it was very worth it; I just have much smaller, or more targeted requests I guess. (Lately, adding features to a static site generator in Python, or setting up log forwarding via a docker compose file)

simonw 5/27/2026||

Claude Code:

  Input tokens:        52,545,485
  Output tokens:        5,767,253
  Cache create tokens:  5,112,029
  Cache read tokens: 1,475,069,465
  Total tokens:      1,538,494,232
  Total cost:        $1,199.79

OpenAI Codex:

  Input tokens:          52,598,013
  Output tokens:          4,681,867
  Reasoning output:       2,091,063
  Cached input tokens: 1,153,844,864
  Total tokens:        1,211,124,744
  Total cost:          $980.37

I'm confident I got value out of OpenAI - I've been mainly on Codex for the last few weeks.

Not so sure I got that value from Claude, just because I've been using it a lot less and somehow the price came to about the same as OpenAI.

Given the code I've been able to build in the past month I genuinely do think I got value for the API price version, and (don't tell OpenAI or Anthropic) I think I'd have paid full price.

I've not spent nearly enough time with GLM-5.1 and co to compare, but I do know that the prompts I'm using with the agents are not prompts I would have expected to work just three months ago.
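For anyone wanting to compare the two usage summaries above on equal terms, here is a minimal sketch that derives the share of cached reads and a blended price per million tokens. The token counts and costs are copied from the summaries; the dictionary layout and the `summarize` helper are illustrative names, and treating "Cache read tokens" and "Cached input tokens" as the same kind of quantity is an assumption.

```python
# Figures copied from the two usage summaries above.
# Assumption: "Cache read tokens" (Claude Code) and "Cached input tokens"
# (OpenAI Codex) are comparable quantities.
USAGE = {
    "Claude Code": {
        "cached_read": 1_475_069_465,
        "total_tokens": 1_538_494_232,
        "total_cost_usd": 1199.79,
    },
    "OpenAI Codex": {
        "cached_read": 1_153_844_864,
        "total_tokens": 1_211_124_744,
        "total_cost_usd": 980.37,
    },
}


def summarize(stats):
    """Return (cached share of total tokens, blended USD per million tokens)."""
    cached_share = stats["cached_read"] / stats["total_tokens"]
    usd_per_million = stats["total_cost_usd"] / (stats["total_tokens"] / 1_000_000)
    return cached_share, usd_per_million


for name, stats in USAGE.items():
    share, per_million = summarize(stats)
    print(f"{name}: {share:.1%} cached reads, ${per_million:.2f} per million tokens")
```

On these figures, cached reads make up roughly 95% of the total for both tools, and the blended price comes out below a dollar per million tokens.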

NortySpock 5/27/2026|||

Cool! Thanks for the details, and your blog posts are usually interesting food for thought, so thank you for them too!

krupan 5/27/2026|||

Are you saying that the software you wrote using those tools generated enough revenue to cover the $2000?

simonw 5/27/2026||

Not yet, but that's because it was almost all open source and I'm really bad at generating revenue from that.

When I account for the amount of time it saved me there's no question $2,000 was worth it.

regularfry 5/27/2026||

If it were me I'd be asking "How long would it have taken me to do that, and what's the rate I'd have been charging for the work I would have been doing otherwise?"

Personally, I've probably spent $60 or so on OpenRouter in the last month or so and got a working project out of it that it would probably have taken me a fortnight to knock together (which is inevitably an under-estimate because it covered things I'd have to learn but K2.5/6 already knew). There's an orders-of-magnitude gap there.
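The comparison above can be sketched as a quick calculation. The $60 spend and the fortnight come from the comment; the 10 working days, 8-hour days and $100/hour rate are purely hypothetical placeholders, so substitute your own numbers.

```python
def time_value_ratio(days_saved, hours_per_day, hourly_rate_usd, tool_spend_usd):
    """Value of the time saved, divided by what was spent on tokens."""
    value_of_time_usd = days_saved * hours_per_day * hourly_rate_usd
    return value_of_time_usd / tool_spend_usd


# Assumptions: a fortnight is 10 working days of 8 hours each,
# billed at a hypothetical $100/hour. The $60 spend is from the comment.
ratio = time_value_ratio(
    days_saved=10, hours_per_day=8, hourly_rate_usd=100.0, tool_spend_usd=60.0
)
print(f"Time saved is worth about {ratio:.0f}x the token spend")
```

Under those placeholder numbers the ratio is a little over two orders of magnitude, which is consistent with the gap described above.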

aenis 5/27/2026||

The end game here is going back from a model where a bunch of product and tech management people sit in the U.S. or Europe, and try to manage thousands of mediocre talent sitting somewhere far away. The new model is you give those coding tools to good engineers colocated with your product people, and you ship good stuff much faster. If you can achieve such a setup, the token costs can be $50k per seat per month and you still run circles around the legacy IT models in terms of efficiency. Giving everyone the API keys and not changing the way products are managed is not going to work.

overgard 5/27/2026|

Good lord, what company would want to spend $600k per employee per year just to go maybe 20% faster (which is what the studies seem to show is a realistic estimate for productivity gains)?

I'm building a product right now with some AI coding (despite my negative sentiment about AI in general, these tools are useful). I am both the product person and the engineer, and I'm pretty decent at using it, so according to the hype I should be seeing like a 10x speedup. I am not seeing that. It's definitely faster, but there are also days where I'm stuck cleaning up things after going too fast for too long, or periods where I need to put the software in front of people to get real feedback, or even periods where I just need to use it extensively myself to find the pain points and bugs. I just don't see this "running circles" once you get past an MVP and you actually need to build something secure and not embarrassingly broken.
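The arithmetic behind that objection, as a rough sketch: $50k per seat per month is $600k per year, and if a 20% speedup is read as 20% more output value from the same engineer (a simplifying assumption, not something the studies state), the break-even point follows directly. The function names are illustrative.

```python
def annual_token_cost(monthly_cost_usd):
    """Yearly token spend for one seat."""
    return monthly_cost_usd * 12


def break_even_output_value(annual_cost_usd, speedup_fraction):
    """Baseline yearly output value an engineer must already produce
    for the extra output from the speedup to cover the token cost.
    Assumes a speedup of X% means X% more output value."""
    return annual_cost_usd / speedup_fraction


cost = annual_token_cost(50_000)  # $50k per seat per month, from the parent comment
needed = break_even_output_value(cost, 0.20)  # 20% faster, from this comment
print(f"Annual token cost per seat: ${cost:,}")
print(f"Break-even baseline output per engineer: ${needed:,.0f}")
```

In other words, under this reading an engineer would already need to be producing around $3M of value a year before a 20% gain pays for $600k of tokens.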

grttq 5/27/2026||

To me the question is, can the frontier labs make the variance of output lower + make the output of higher quality to justify their prices?

If not, lower-priced Chinese offerings will be better as they're cheaper per token - giving you more attempts to offset the variance.

My feeling on the former is no... I believe they tried really hard but they've settled on pure marketing now to attempt to fight off the Chinese with perceived superiority in quality.

Gravityloss 5/27/2026||

I see two really good ideas for monetizing the free tier for consumers.

Firstly, if the user is asking for things where AI can link to products or services to buy, the relevancy is very good, much higher than in other types of ads.

Secondly, since the AI often takes time to compute answers to users' questions, they could be shown ads while waiting. People could perhaps be less annoyed by this than some other commercials since they know the break has to be there anyway.

(The first idea is something I came up with when asking Claude to compare some products, or asking for help with lawn care. The second idea was by a colleague.)

mchusma 5/28/2026|

If we define product market fit as profitable with a trillion dollar valuation, I think the term has lost its helpfulness.

I do agree with the author that these companies seem much stronger financially recently though.
