Posted by atgctg 12/11/2025

GPT-5.2 (openai.com)
https://platform.openai.com/docs/guides/latest-model

System card: https://cdn.openai.com/pdf/3a4153c8-c748-4b71-8e31-aecbde944...

1195 points | 1083 comments
tabletcorry 12/11/2025|
Slight increase in model cost, but looks like benefits across the board to match.

          input   cached  output   (per 1M tokens)
  gpt-5.2 $1.75   $0.175  $14.00
  gpt-5.1 $1.25   $0.125  $10.00
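For concreteness, a quick back-of-the-envelope in Python (prices from the table above; the 50k-in / 20k-cached / 5k-out workload is an arbitrary assumption):

  # Per-request cost delta between gpt-5.1 and gpt-5.2 at list prices.
  # Prices are $ per 1M tokens; the example workload is an arbitrary assumption.
  PRICES = {
      "gpt-5.1": {"input": 1.25, "cached": 0.125, "output": 10.00},
      "gpt-5.2": {"input": 1.75, "cached": 0.175, "output": 14.00},
  }

  def request_cost(model, input_toks, cached_toks, output_toks):
      p = PRICES[model]
      return (input_toks * p["input"]
              + cached_toks * p["cached"]
              + output_toks * p["output"]) / 1_000_000

  for model in PRICES:
      print(model, round(request_cost(model, 50_000, 20_000, 5_000), 4))
  # gpt-5.1 0.115, gpt-5.2 0.161 -- a uniform 40% increase across all three tiers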
jtbayly 12/11/2025||
40% increase is not "slight."
credit_guy 12/11/2025||
Not the OP, but I think "slight" here is in relation to Anthropic and Google. Claude Opus 4.5 comes at $25/MT (million tokens), Sonnet 4.5 at $22.5/MT, and Gemini 3 at $18/MT. GPT 5.2 at $14/MT is still the cheapest.
deaux 12/12/2025||
Your numbers are very off.

  $25 - Opus 4.5
  $15 - Sonnet 4.5
  $14 - GPT 5.2
  $12 - Gemini 3 Pro
Even if you're including input, your numbers are still off.
credit_guy 12/12/2025||
I used the pricing for long context (>200k) in all cases. Like lots of other people, I use these models as coding assistants, and as such, hitting and exceeding 200k is quite the norm. The numbers you are showing are for <200k context length.
deaux 12/13/2025||
I also use them as coding assistants, among other things, like lots of other people, and hitting and exceeding 200k is absolutely not the norm unless you're using a large number of huge MCP servers. At those context sizes output quality declines significantly, even with claims of "we support long context". This is why all those coding assistants use auto-compression: not just to save money, but largely to maintain quality. In any case, >200k-input calls are a small fraction of all calls.

Ironically, at that input size input costs dominate rather than output costs, so if that's the use case you're targeting, you want to include input in your quoted prices anyway.
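A rough sketch of that split, assuming gpt-5.2's <200k list prices from upthread and a hypothetical 250k-in / 5k-out coding call (long-context surcharges would tilt it even further toward input):

  # At long contexts, input cost dwarfs output cost.
  INPUT_PRICE, OUTPUT_PRICE = 1.75, 14.00  # $ per 1M tokens, gpt-5.2 list prices

  input_toks, output_toks = 250_000, 5_000  # assumed long-context coding call
  input_cost = input_toks * INPUT_PRICE / 1e6     # $0.4375
  output_cost = output_toks * OUTPUT_PRICE / 1e6  # $0.0700

  print(f"input ${input_cost:.4f} vs output ${output_cost:.4f}")
  print(f"input share: {input_cost / (input_cost + output_cost):.0%}")  # ~86%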

commandar 12/11/2025|||
In particular, the API pricing for GPT-5.2 Pro has me wondering what on earth the possible market for that model is beyond getting to claim a couple of percent higher benchmark performance in press releases.

> Input: $21.00 / 1M tokens
>
> Output: $168.00 / 1M tokens

That's the most "don't use this" pricing I've seen on a model.

https://openai.com/api/pricing/

aimanbenbaha 12/11/2025|||
Last year, o3 high scored 88% on ARC-AGI 1 at more than $4,000/task. This model, at its xhigh configuration, scores 90.5% at just $11.64 per task.

General intelligence has gotten ridiculously less expensive. I don't know if it's because of compute and energy abundance, or attention mechanisms getting more efficient, or both, but we have to acknowledge the bigger picture and the relative prices.
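The scale of that drop, using the figures above:

  # Cost-per-task drop on ARC-AGI 1, using the numbers cited above.
  o3_cost, gpt52_cost = 4000.00, 11.64  # $ per task
  print(f"~{o3_cost / gpt52_cost:.0f}x cheaper in roughly a year")  # ~344x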

commandar 12/11/2025||
Sure, but the reason I'm confused by the pricing is that the pricing doesn't exist in a vacuum.

Pro barely performs better than Thinking in OpenAI's published numbers, but comes at ~10x the price with an explicit disclaimer that it's slow on the order of minutes.

If the published performance numbers are accurate, it seems like it'd be incredibly difficult to justify the premium.

At least on the surface level, it looks like it exists mostly to juice benchmark claims.

rvnx 12/11/2025||
It could be using the same early trick as Grok (at least in its earlier versions): spin up 10 agents that work on the problem in parallel, then take a consensus on the answer. That would explain the price and the latency.

Essentially a newbie trick that works really well but isn't efficient, while still looking like an amazing breakthrough.

(if someone knows the actual implementation I'm curious)
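For reference, the trick being described is basically self-consistency / best-of-N sampling. A minimal sketch, purely speculative and not OpenAI's actual implementation (`ask_model` is a hypothetical stand-in for whatever completion API you use; n=10 mirrors the guess above):

  import collections
  from concurrent.futures import ThreadPoolExecutor

  def ask_model(prompt: str) -> str:
      raise NotImplementedError("call your completion API here")

  def consensus_answer(prompt: str, n: int = 10) -> str:
      # Sample n answers in parallel, then majority-vote. Exact-match voting
      # assumes short canonical answers (numbers, multiple choice); free-form
      # text would need a grader or semantic clustering instead.
      with ThreadPoolExecutor(max_workers=n) as pool:
          answers = list(pool.map(ask_model, [prompt] * n))
      return collections.Counter(answers).most_common(1)[0][0]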

anticensor 12/14/2025||
The magic number appears to be 12 in the case of GPT-5.2 Pro.
asgraham 12/11/2025||||
Those prices seem geared toward people who are completely price insensitive, who just want "the best" at any cost. If the margins on that premium model are as high as they should be, it's a smart business move to give them what they want.
arthurcolle 12/11/2025||||
gpt-4-32k pricing was originally $60.00 / $120.00.
wahnfrieden 12/11/2025||||
Pro solves many problems for me on the first try that the other 5.1 models are unable to solve after many iterations. I don't pay API pricing, but if I could afford it, I would in some cases for the much larger context window it affords when a problem calls for it. I'd rather spend some tens of dollars to solve a problem than grind at it for hours.
reactordev 12/11/2025||||
Less an issue if your company is paying
rvnx 12/11/2025||
Even less an issue when OpenAI provides you free credits
Leynos 12/11/2025|||
Someone on Reddit reported being charged $17 for one prompt on 5-pro, which suggests around 125,000 reasoning tokens.

Makes me feel guilty for spamming pro with any random question I have multiple times a day.
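The implied token count depends on which output price you assume; Pro-class output has been quoted in roughly the $120-168 per 1M range (the latter is 5.2 Pro's price from upthread), which brackets that estimate:

  # Back out reasoning tokens from a $17 charge at assumed output prices.
  for price_per_m in (120.00, 136.00, 168.00):  # $ per 1M output tokens (assumptions)
      print(f"${price_per_m:g}/1M -> ~{17.00 / price_per_m * 1e6:,.0f} tokens")
  # ~141,667 / ~125,000 / ~101,190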

llmslave 12/11/2025|||
They probably just beefed up compute runtime on what is the same underlying model.
anvuong 12/11/2025||
In what world is that a slight increase?
lazarus01 12/12/2025||
My god, what terrible marketing, totally written by AI. No flow whatsoever.

I use Gemini 3 with my $10/month copilot subscription on vscode. I have to say, Gemini 3 is great. I can do the work of four people. I usually run out of premium tokens in a week. But I’m actually glad there is a limit or I would never stop working. I was a skeptic, but it seems like there is a wider variety of patterns in the training distribution.

iwontberude 12/11/2025||
I have already cancelled. Claude is more than enough for me. I don’t see any point in splitting hairs. They are all going to keep lying more and more sneakily.
jstummbillig 12/11/2025||
So, right off the bat: 5.2 code talk (through codex) feels really nice. The first coding attempt was a little meh compared to 5.1 codex max (reflecting what they wrote themselves), but simply planning / discussing things felt markedly better than anything I remember from any previous model, from any company.

I remain excited about new models. It's like finding my coworker has become 10% smarter every other week.

qoez 12/11/2025||
Incidentally, this is also the 10th anniversary, to the day, of OpenAI's founding.
johnwheeler 12/11/2025||
I'm not interested in using OpenAI anymore because Sam Altman is so untrustworthy. All you see on X.com is him and Greg Brockman kissing David Sacks' ass, trying to make inroads with him, asking Disney for investments, and shit. Are you kidding? Who wants to support these clowns? Let's let Google win. Let's let Anthropic win. Anyone but Sam Altman.
riazrizvi 12/11/2025||
Does it still use the word ‘fluff’ in 90% of its preambles, or is it finally able to get straight to the point?
system2 12/11/2025||
"Investors are putting pressure, change the version number now!!!"
exe34 12/11/2025|
I'm quite sad about the S-curve hitting us hard in the transformers. For a short period, we had the excitement of "ooh if GPT-3.5 is so good, GPT-4 is going to be amazing! ooh GPT-4 has sparks of AGI!" But now we're back to version inflation for inconsequential gains.
verdverm 12/11/2025|||
2025 was the year most of the big AI labs released their first real thinking models.

Now we can create new samples and evals for more complex tasks to train up the next generation: more planning, decomposition, context handling, agentic-oriented work.

OpenAI has largely fumbled its early lead; the exciting stuff is happening elsewhere.

ToValueFunfetti 12/11/2025||||
Take this all with a grain of salt as it's hearsay:

From what I understand, nobody has done any real scaling since the GPT-4 era. 4.5 was a bit larger than 4, but not as much as the orders of magnitude difference between 3 and 4, and 5 is smaller than 4.5. Google and Anthropic haven't gone substantially bigger than GPT-4 either. Improvements since 4 are almost entirely from reasoning and RL. In 2026 or 2027, we should see a model that uses the current datacenter buildout and actually scales up.

Leynos 12/11/2025|||
4.5 is widely believed to be an order of magnitude larger than GPT-4, as reflected in the API inference cost. The problem is the number of parameters you can fit in the memory of one GPU. Pretty much every large GPT model from 4 onwards has been mixture of experts, but for a 10-trillion-parameter-scale model, you'd be talking a lot of experts and a lot of inter-GPU communication.

With FP4 on the Blackwell GPUs, it should become much more practical to run a model of that size at GPT-5.x deployment scale. We're just going to have to wait for the GBx00 systems to be physically deployed at scale.
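The weight-memory math behind that, as a rough sketch (10T parameters is the hypothetical scale from above; 192 GB is one Blackwell B200's HBM, and this ignores KV cache, activations, and redundancy):

  import math

  # Weights-only memory for a hypothetical 10T-parameter model.
  PARAMS = 10e12
  BYTES_PER_PARAM = {"fp16": 2.0, "fp8": 1.0, "fp4": 0.5}
  GPU_MEM_GB = 192  # e.g. one Blackwell B200

  for fmt, b in BYTES_PER_PARAM.items():
      total_gb = PARAMS * b / 1e9
      gpus = math.ceil(total_gb / GPU_MEM_GB)
      print(f"{fmt}: {total_gb:,.0f} GB -> at least {gpus} GPUs for weights alone")
  # fp16: 20,000 GB (105 GPUs); fp8: 10,000 GB (53); fp4: 5,000 GB (27)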

snovv_crash 12/11/2025|||
Datacenter capacity is being snapped up for inference too though.
JanSt 12/11/2025||||
I don't feel the S-curve at all yet. Still an exponential for me
exe34 12/11/2025||
With a very long doubling time?
gessha 12/11/2025|||
Because it will take thousands of underpaid researchers randomly searching through the solution space to get to the next improvement, not 2-3 companies pressed to monetize and enshittify their product before the money runs out. That, and winning more hardware lotteries.
astrange 12/12/2025||
Underpaid? OpenAI!? It's pretty good I think.

https://www.levels.fyi/companies/openai/salaries/software-en...

gessha 12/12/2025||
I’m talking about grad students, not OpenAI researchers.
jaimex2 12/11/2025||
They just keep flogging that dead horse.

The winner in this race will be whoever gets small local models to perform as well on consumer hardware. It'll also pop the tech bubble in the US.

MagicMoonlight 12/11/2025|
They’re definitely just training the models on the benchmarks at this point
roxolotl 12/11/2025|
Yeah, either this is an incredible jump or we've finally gotten confirmation that benchmarks are BS.