GPT-4.1 in the API - Hacker News

Posted by maheshrijal 4/14/2025

680 points | 492 comments | page 8

lich-001 4/15/2025|

I wish they would deprecate all existing ones when they bake a new model instead of aiming for pointless model diversity.

croemer 4/14/2025||

Testing against unspecified other "leading" models allows for shenanigans:

> Qodo tested GPT‑4.1 head-to-head against other leading models [...] they found that GPT‑4.1 produced the better suggestion in 55% of cases

The linked blog post goes 404: https://www.qodo.ai/blog/benchmarked-gpt-4-1/

gs17 4/14/2025|

The post seems to be up now, and it seems to compare GPT-4.1 slightly favorably to Claude 3.7.

croemer 4/14/2025||

Right, now it's up, and the comparison against Claude 3.7 is better than I feared based on the wording. Though why does the OpenAI announcement talk of a comparison against multiple leading models when the Qodo blog post only tests against Claude 3.7...

__mharrison__ 4/14/2025||

I know this is somewhat off topic, but can someone explain the naming convention used by OpenAI? Number vs "mini" vs "o" vs "turbo" vs "chat"?

iteratethis 4/14/2025|

"Mini" refers to the size of the model (fewer parameters).

"o" means "omni", which means it's multimodal.

simianwords 4/14/2025||

Could anyone guess why they didn't ship this in the chat UI?

simianwords 4/15/2025||

Answering my own question after some research. It looks like OpenAI decided not to introduce 4.1 in the ChatGPT UI because 4.1 is not necessarily a better model than 4o, since it is not multimodal.

Now you can imagine that introducing a newer "type" of model like 4.1, one that's better at following instructions and at coding, would add overhead to a set of options that's already too large.

OpenAI confirmed somewhere that they have already incorporated the enhancements made in 4.1 into the 4o model in the ChatGPT UI. I assume they would delegate to the 4.1 model if the prompt doesn't require specific 4o capabilities.

Also, one of the improvements made in 4.1 is instruction following. This type of thing is better suited to agentic use cases, which typically go through an API.

KoolKat23 4/14/2025||

The memory thing? More resource-intensive?

bli940505 4/14/2025||

Does this mean that the o1 and o3-mini models are also using 4.1 as the base now?

soheil 4/14/2025||

Main takeaways:

- Coding accuracy improved dramatically

- Handles 1M-token context reliably

- Much stronger instruction following

p1dda 4/15/2025||

LLMs are not intelligent

LeicaLatte 4/14/2025||

i've recently set claude 3.7 as the default option for customers when they start new chats in my app. this was a recent change, and i'm feeling good about it. supporting multiple providers can be a nightmare for customer service, especially when it comes to billing and handling response quality queries. with so many choices from just one provider, it simplifies things significantly. curious about how openai manages customer service internally.

yieldcrv 4/14/2025|

More season 4’s than Attack on Titan
