For many "simple" LLM tasks, GPT-5-mini was sufficient 99% of the time. Hopefully these models can do even more, at closer to 100% accuracy.
The prices are up 2-4x compared to GPT-5-mini and nano. Were those models just loss leaders, or are these substantially larger/better?
GPT-5.4-nano is less impressive. I would stick to gpt-5.4-mini where precise data is a requirement. But it is fast, and probably cheaper and of better quality than an 8-20B parameter local model would be.
(See https://encyclopedia.foundation/benchmarks/dashboard/ for details. The data is moderately noisy: some outlier (15s) calls are included, a few benchmark questions are ambiguous, and some prices shown are very rough estimates.)
Preregistering my predictions:
Mini: better than Haiku but not as good as Flash 3, especially at reasoning=none.
Nano: worse than Flash 3 Lite. Probably better than Qwen 3.5 27b.
OpenAI restructures after Anthropic captures 70% of new enterprise deals. Claude Code hits $2.5B while Codex lags at $1B ahead of dual IPOs.
Src: https://www.implicator.ai/openai-cuts-its-side-quests-the-en...
I think no model, SOTA or not, has either the context or the attention to do anything meaningful with a huge amount of logs.
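The usual workaround for the context limit is a map-reduce pass: split the logs into chunks that each fit the model's window, summarize every chunk independently, then summarize the summaries. Here is a minimal sketch of that pattern; `summarize()` is a hypothetical stand-in for a real model call (here it just keeps `ERROR` lines so the example is runnable), and the character budget is a crude proxy for a token budget.

```python
def chunk_lines(lines, max_chars=4000):
    """Group log lines into chunks no larger than max_chars each."""
    chunk, size = [], 0
    for line in lines:
        if size + len(line) > max_chars and chunk:
            yield "".join(chunk)
            chunk, size = [], 0
        chunk.append(line)
        size += len(line)
    if chunk:
        yield "".join(chunk)

def summarize(text):
    # Placeholder for an LLM call; a real implementation would send
    # the chunk to a model. Here it just filters for ERROR lines.
    return "\n".join(l for l in text.splitlines() if "ERROR" in l)

def map_reduce_logs(lines, max_chars=4000):
    # Map: summarize each chunk independently.
    partials = [summarize(c) for c in chunk_lines(lines, max_chars)]
    # Reduce: summarize the concatenated partial summaries.
    return summarize("\n".join(partials) + "\n")
```

Even with this pattern, the "attention" half of the objection stands: each chunk is summarized blind to the others, so cross-chunk correlations (e.g. a slow leak that only shows up across hours of logs) are easily lost.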