https://platform.openai.com/docs/models/gpt-4.1
https://platform.openai.com/docs/models/gpt-4.1-mini
https://platform.openai.com/docs/models/gpt-4.1-nano

But the price is what matters.
Wait, wouldn’t this be a decent test for reasoning?
Every patch changes things, and there’s massive complexity with the various interactions between items, uniques, runes, and more.
The lack of availability in ChatGPT is disappointing, and they're playing on ambiguity here. They frame it as if releasing 4.1 in ChatGPT were unnecessary, since 4o is supposedly great, while simultaneously showing how much better 4.1 is than GPT-4o.
One wager is that the inference cost is significantly higher for 4.1 than for 4o, and that they expect most ChatGPT users not to notice a marginal difference in output quality. API users, however, will notice. Alternatively, 4o might have been aggressively tuned to be conversational while 4.1 is more "neutral"? I wonder.
Does it, though? They said that "many" have already been incorporated. I simply don't buy their vague statements there. These are different models. They may share some training/post-training recipe improvements, but they are still different.
Whereas in the API, I want very strict versioning of the models I'm using, so I can run my own evals and pick the model that works best.
Supposedly that’s coming with GPT-5.
They still have a mess of models in ChatGPT for now, and it doesn't look like this is going to get better immediately (even though for GPT-5, they ostensibly want to unify them). You have to choose among all of them anyway.
I'd like to be able to choose 4.1.
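The versioning point above can be sketched simply: with the API you pin an exact dated snapshot and select per task from your own eval results. The snapshot names and scores below are illustrative placeholders, not real benchmark data.

```python
# Illustrative sketch: pin dated model snapshots and pick one per task
# based on your own eval scores. All numbers here are made-up placeholders.

PINNED_MODELS = {
    "gpt-4.1-2025-04-14":      {"summarize": 0.91, "extract": 0.88},
    "gpt-4.1-mini-2025-04-14": {"summarize": 0.89, "extract": 0.90},
    "gpt-4.1-nano-2025-04-14": {"summarize": 0.80, "extract": 0.85},
}

def best_model_for(task: str) -> str:
    """Return the pinned snapshot with the highest eval score for a task."""
    return max(PINNED_MODELS, key=lambda m: PINNED_MODELS[m].get(task, 0.0))

print(best_model_for("summarize"))
print(best_model_for("extract"))
```

The point is that a pinned snapshot never changes under you, so the eval scores stay meaningful; an unversioned ChatGPT model can be swapped silently.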
[Benchmark chart: Broad Knowledge 25.1 · Coder: Larger Problems 25.1 · Coder: Line-focused 25.1]
@sama: underrated tweet
Source: https://x.com/stevenheidel/status/1911833398588719274
4.1 is 26.6% better at coding than 4.5. Got it. Also…see the em dash
gpt-4.1
- Input: $2.00
- Cached Input: $0.50
- Output: $8.00
gpt-4.1-mini
- Input: $0.40
- Cached Input: $0.10
- Output: $1.60
gpt-4.1-nano
- Input: $0.10
- Cached Input: $0.025
- Output: $0.40
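Using the price list above, per-call cost is easy to sketch (prices are per 1M tokens; the workload numbers in the example are made up for illustration):

```python
# Cost per 1M tokens (USD), taken from the price list above.
PRICES = {
    "gpt-4.1":      {"input": 2.00, "cached": 0.50,  "output": 8.00},
    "gpt-4.1-mini": {"input": 0.40, "cached": 0.10,  "output": 1.60},
    "gpt-4.1-nano": {"input": 0.10, "cached": 0.025, "output": 0.40},
}

def cost(model, input_toks, output_toks, cached_toks=0):
    """Dollar cost of one call: uncached input + cached input + output."""
    p = PRICES[model]
    return ((input_toks - cached_toks) * p["input"]
            + cached_toks * p["cached"]
            + output_toks * p["output"]) / 1_000_000

# Example workload: 10k-token prompt (8k of it cached), 1k-token completion.
for m in PRICES:
    print(f"{m}: ${cost(m, 10_000, 1_000, cached_toks=8_000):.4f}")
```

On that example workload, mini comes out 5x cheaper than 4.1, and nano 4x cheaper than mini, which is why the tiering matters for high-volume API use.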
I'm not as concerned about nomenclature as other people are; I think that's too often a reaction to the headline rather than the article. But in this case, I'm not sure whether I'm supposed to understand "nano" as categorically different from "mini" in terms of what it means as a variation on a core model.
gpt-4o-mini for comparison:
- Input: $0.15
- Cached Input: $0.075
- Output: $0.60
I was using gpt-4o-mini with the batch API, which I recently replaced with mistral-small-latest, which costs $0.10/$0.30 (or $0.05/$0.15 when using the batch API). I may switch to 4.1-nano, but I'd have to be overwhelmed by its performance in comparison to mistral.
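For a batch workload like the one described, the comparison comes down to simple arithmetic. Prices are the ones quoted in the text; the token counts are illustrative, and for simplicity the sketch assumes nano at list price with no batch discount applied.

```python
# Prices per 1M tokens (USD), input/output, as quoted above.
MISTRAL_SMALL_BATCH = {"input": 0.05, "output": 0.15}  # batch rate ($0.10/$0.30 halved)
GPT_41_NANO         = {"input": 0.10, "output": 0.40}  # list price; no batch discount assumed

def job_cost(prices, input_toks, output_toks):
    """Total dollar cost for a batch job at the given per-1M-token prices."""
    return (input_toks * prices["input"] + output_toks * prices["output"]) / 1_000_000

# Example job: 50M input tokens, 10M output tokens.
mistral = job_cost(MISTRAL_SMALL_BATCH, 50_000_000, 10_000_000)
nano    = job_cost(GPT_41_NANO, 50_000_000, 10_000_000)
print(f"mistral-small batch: ${mistral:.2f}  vs  gpt-4.1-nano: ${nano:.2f}")
```

Under those assumptions mistral-small on the batch rate stays a bit over 2x cheaper, which is why nano would need a clear quality win to justify the switch.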
It's still not as notable as Claude's cached input at 1/10th the cost of raw input, but it shows OpenAI is making improvements in this area.