Posted by atgctg 12/11/2025
System card: https://cdn.openai.com/pdf/3a4153c8-c748-4b71-8e31-aecbde944...
gpt-5.2: $1.75 input / $0.175 cached input / $14.00 output (per 1M tokens)
gpt-5.1: $1.25 input / $0.125 cached input / $10.00 output (per 1M tokens)
Output price per 1M tokens:
$25 - Opus 4.5
$15 - Sonnet 4.5
$14 - GPT 5.2
$12 - Gemini 3 Pro
Even if you're including input, your numbers are still off. Ironically, at that input size input costs dominate rather than output, so if that's the use case you're going for, you'd want to include those in your quoted prices anyway.
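To make that concrete, here's a rough cost check using the gpt-5.2 list prices above and an assumed long-context request shape (the 200k-input / 2k-output split is hypothetical, just to illustrate the point):

    # Rough check of "input costs dominate at large context", using the
    # gpt-5.2 list prices above and an assumed request shape.
    INPUT_PER_M = 1.75    # $ per 1M input tokens
    OUTPUT_PER_M = 14.00  # $ per 1M output tokens

    input_tokens, output_tokens = 200_000, 2_000  # hypothetical request

    input_cost = input_tokens / 1e6 * INPUT_PER_M     # $0.35
    output_cost = output_tokens / 1e6 * OUTPUT_PER_M  # $0.028

    print(f"input: ${input_cost:.3f}, output: ${output_cost:.3f}")
    # Input comes out at more than 10x the output cost here, so quoting
    # only the output $/1M figure understates what a long-context call costs.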
> Input: $21.00 / 1M tokens
> Output: $168.00 / 1M tokens
That's the most "don't use this" pricing I've seen on a model.
General intelligence has gotten ridiculously less expensive. I don't know if it's because of compute and energy abundance, or attention mechanisms becoming more efficient, or both, but we have to acknowledge the bigger picture and the relative prices.
Pro barely performs better than Thinking in OpenAI's published numbers, but comes at ~10x the price, with an explicit disclaimer that it's slow, on the order of minutes.
If the published performance numbers are accurate, it seems like it'd be incredibly difficult to justify the premium.
At least on the surface level, it looks like it exists mostly to juice benchmark claims.
Essentially a newbie trick that works really well but isn't efficient, while still looking like an amazing breakthrough.
(if someone knows the actual implementation I'm curious)
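For what it's worth, one common guess (pure speculation, not OpenAI's documented implementation) is something like best-of-n sampling: run several attempts in parallel and pick or vote on an answer. A minimal sketch of that idea, with generate() as a hypothetical stand-in for a single model call:

    import random
    from collections import Counter

    def generate(prompt: str) -> str:
        """Hypothetical stand-in for one model call; returns a canned answer."""
        return random.choice(["answer A", "answer A", "answer B"])

    def best_of_n(prompt: str, n: int = 8) -> str:
        # n independent samples, so cost scales roughly linearly with n,
        # which is why it "works really well but isn't efficient".
        samples = [generate(prompt) for _ in range(n)]
        # Naive selection: majority vote over raw outputs. A real system
        # would more likely rank candidates with a verifier/reward model.
        return Counter(samples).most_common(1)[0][0]

    print(best_of_n("Is P = NP?"))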
Makes me feel guilty for spamming pro with any random question I have multiple times a day.
I use Gemini 3 with my $10/month Copilot subscription in VS Code. I have to say, Gemini 3 is great. I can do the work of four people. I usually run out of premium tokens in a week. But I'm actually glad there is a limit or I would never stop working. I was a skeptic, but it seems like there is a wider variety of patterns in the training distribution.
I remain excited about new models. It's like finding my coworker be 10% smarter every other week.
Now we can create new samples and evals for more complex tasks to train up the next gen: more planning, decomposition, context handling, and agentic-oriented work.
OpenAI has largely fumbled their early lead; the exciting stuff is happening elsewhere.
From what I understand, nobody has done any real scaling since the GPT-4 era. 4.5 was a bit larger than 4, but not as much as the orders of magnitude difference between 3 and 4, and 5 is smaller than 4.5. Google and Anthropic haven't gone substantially bigger than GPT-4 either. Improvements since 4 are almost entirely from reasoning and RL. In 2026 or 2027, we should see a model that uses the current datacenter buildout and actually scales up.
With FP4 in the Blackwell GPUs, it should become much more practical to run a model of that size at the deployment scale of GPT-5.x. We're just going to have to wait for the GBx00 systems to be physically deployed at scale.
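Rough numbers on why the precision matters, assuming a hypothetical 2-trillion-parameter model (actual frontier model sizes aren't public); this counts weights only, ignoring KV cache and activations:

    # Back-of-the-envelope weight memory at different precisions for an
    # assumed 2-trillion-parameter model (real model sizes are not public).
    params = 2e12
    bytes_per_param = {"fp16": 2.0, "fp8": 1.0, "fp4": 0.5}

    for fmt, b in bytes_per_param.items():
        tib = params * b / 2**40  # weights only; KV cache etc. add more
        print(f"{fmt}: ~{tib:.1f} TiB of weights")
    # fp16: ~3.6 TiB, fp8: ~1.8 TiB, fp4: ~0.9 TiB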
https://www.levels.fyi/companies/openai/salaries/software-en...
The winner in this race will be whoever gets small local models to perform as well on consumer hardware. It'll also pop the tech bubble in the US.