> Qodo tested GPT‑4.1 head-to-head against other leading models [...] they found that GPT‑4.1 produced the better suggestion in 55% of cases
The linked blog post goes 404: https://www.qodo.ai/blog/benchmarked-gpt-4-1/
"o" means "omni", which means it's multimodal.
Now imagine introducing a newer "type" of model like 4.1, one that's better at following instructions and better at coding: it would add overhead to a set of options that's already too large.
OpenAI confirmed somewhere that they have already incorporated the enhancements made in 4.1 into the 4o model in the ChatGPT UI. I assume they would delegate to the 4.1 model if the prompt doesn't require specific 4o capabilities.
Also, one of the improvements made in 4.1 is instruction following. That kind of improvement is better suited to agentic use cases, which are typically accessed through an API.
- Coding accuracy improved dramatically
- Handles 1M-token context reliably
- Much stronger instruction following