Posted by meetpateltech 1 day ago

GPT‑5.4 Mini and Nano (openai.com)
244 points | 143 comments
Rapzid 1 day ago|
Oh... I thought maybe these would be upgrades to gpt-4.1, gpt-4.1-mini, etc. But the latency is way too high compared to the 400-600. Yeah, they're different models, but the naming is confusing.
powera 1 day ago||
I've been waiting for this update.

For many "simple" LLM tasks, GPT-5-mini was sufficient 99% of the time. Hopefully these models will handle even more tasks, and with closer to 100% accuracy.

The prices are up 2-4x compared to GPT-5-mini and nano. Were those models just loss leaders, or are these substantially larger/better?

HugoDias 1 day ago||
For us it was also pretty good, but performance decreased recently, which forced us to migrate to haiku-4.5. More expensive but much more reliable (when Anthropic is up, of course).
throwaway911282 1 day ago||
They don't change the model weights (no frontier lab does). If you have evals, and your prompts and tool calls are all the same, I'm curious how you're concluding that performance decreased.
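To make that concrete: a fixed eval set with pinned prompts is the only way to tell drift from anecdote. A minimal sketch, with a stubbed `call_model` standing in for a real API client (the function, prompts, and expected answers here are all hypothetical stand-ins):

```python
# Minimal regression eval: run a pinned prompt set through a model
# callable and track the pass rate over time. `call_model` is a stub;
# a real harness would call the provider's API here.

def call_model(prompt: str) -> str:
    # Stub with canned answers, standing in for a real API call.
    canned = {
        "How many Rs are in 'strawberry'?": "3",
        "What is 17 * 23?": "391",
    }
    return canned.get(prompt, "")

def run_eval(cases: list[tuple[str, str]]) -> float:
    """Return the pass rate over (prompt, expected_answer) pairs."""
    passed = sum(1 for prompt, expected in cases
                 if call_model(prompt).strip() == expected)
    return passed / len(cases)

cases = [
    ("How many Rs are in 'strawberry'?", "3"),
    ("What is 17 * 23?", "391"),
]
print(run_eval(cases))  # 1.0 with the stub; log this per model and date
```

Run it on a schedule against the same model string and you have a number to point at instead of a feeling.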
powera 1 day ago||
So far on my (simple) benchmarks, GPT-5.4-mini is looking very good. GPT-5.4-mini is about 30% faster than GPT-5-mini. GPT-5.4-mini gets 80% on the "how many Rs in Strawberry" test, and nearly perfect scores on everything else I threw at it.

GPT-5.4-nano is less impressive. I would stick to gpt-5.4-mini where precise data is a requirement. But it is fast, and probably cheaper and better quality than an 8-20B parameter local model would be.

(See https://encyclopedia.foundation/benchmarks/dashboard/ for details. The data is moderately noisy: some outlier (15s) calls are included, a few benchmark questions are ambiguous, and some prices shown are very rough estimates.)

bananamogul 1 day ago||
They could call them something like "sonnet" and "haiku", maybe.
xyproto 1 day ago||
OpenAI has "open" in the name without being anything similar to "open source". Additionally, they have not rejected using their technology for automatically killing people and for mass surveillance. I deleted my OpenAI account, and it felt good. Recommended.
jbellis 1 day ago||
Benchmarking these now.

Preregistering my predictions:

Mini: better than Haiku but not as good as Flash 3, especially at reasoning=none.

Nano: worse than Flash 3 Lite. Probably better than Qwen 3.5 27b.

attentive 1 day ago|
Please post it here. I'd also like to know if 5.4 mini is better than Flash 3. Include reasoning and timing, if possible.
beernet 1 day ago||
Crazy how OAI is way behind now, and the only one to blame is Sam, his ego, and his lust for influence. Their downward trajectory of paying accounts since "the move" (the DoW deal) is an open secret. If you had installed a new CEO at OAI six months ago and told him to destroy the company, it would have been hard for that CEO to do a better job of it than Sam did. He should have left when he was let go, but decided to go full Greg and MAGA instead. Here we are. Go Dario
beernet 1 day ago|
Just to elaborate, as I am getting downvoted by tech bros:

OpenAI restructures after Anthropic captures 70% of new enterprise deals. Claude Code hits $2.5B while Codex lags at $1B ahead of dual IPOs.

Src: https://www.implicator.ai/openai-cuts-its-side-quests-the-en...

jerrygoyal 1 day ago||
Is GPT-5.4 Mini drastically or only marginally better than GPT-5 Mini for writing tasks?
simianwords 1 day ago||
why isn't nano available in codex? could be used for ingesting huge amounts of logs and other such things
patates 1 day ago|
IMHO the best way is to let a SOTA model look at a bunch of random samples and write you tools to analyze them.

I think no model, SOTA or not, has either the context or the attention to do anything meaningful with a huge amount of logs directly.
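For flavor, here's the kind of small deterministic tool such a model might write after seeing samples: normalize volatile fields into signatures and count buckets, rather than pushing raw logs through the model. The regex patterns and sample lines are made up:

```python
# Sketch of a generated log-analysis tool: collapse similar error lines
# into one signature by masking volatile details, then count buckets.

import re
from collections import Counter

def signature(line: str) -> str:
    """Normalize volatile details so similar errors collapse together."""
    line = re.sub(r"0x[0-9a-fA-F]+", "<ADDR>", line)  # hex addresses first
    line = re.sub(r"\d+", "<N>", line)                # then bare numbers
    return line

logs = [
    "timeout after 30s on request 8841",
    "timeout after 45s on request 9123",
    "segfault at 0xdeadbeef",
]
counts = Counter(signature(l) for l in logs)
print(counts.most_common(1)[0])  # ('timeout after <N>s on request <N>', 2)
```

The model only has to see a few dozen sample lines to write something like this; the tool then scales to millions of lines for free.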

machinecontrol 1 day ago||
What's the practical advantage of using a mini or nano model versus the standard GPT model?
aavci 1 day ago|
Cheaper. Every month or so I revisit the models we use and check whether each can be replaced by the cheapest, smallest model that still handles the task. Some people do fine-tuning to achieve this too.
kseniamorph 1 day ago|
wow, not a bad result on the computer-use benchmark for the mini model. for example, Claude Sonnet 4.6 shows 72.5%, almost on par with GPT-5.4 mini (72.1%). but sonnet costs 4x more on input and 3x more on output
PunchTornado 23 hours ago|
what's the point of this benchmark if sonnet works great on my tasks and mini can't solve them?