Posted by T-A 1 day ago
Not looking good so far
I think a problem with open-weight models is that while you can improve them, you are not going to create the next generation of LLMs by fine-tuning. We are at the mercy of frontier labs for access to SOTA LLMs. For example, Anthropic recently started requiring identity verification for Claude [0], same for OpenAI [1].
If one day China's distillation labs stop releasing their LLMs as open-weight, I doubt American labs will continue to release free LLM weights without that competition.
That's where fully open pipelines shine: they enable the community to create the next generation of SOTA LLMs. That is the only way LLMs truly become sovereign.
This notion that Chinese labs are merely distilling frontier models is quite an unwarranted slur. Those labs have published WAY more useful research than US labs on RL techniques, novel model architectures, training pipelines, etc. They have also hit intelligence-per-parameter densities that US labs have yet to attain.
Apart from that, merely training a model on outputs from another model, off-policy and without the logits, doesn’t really work that well.
The Chinese labs know how to build frontier-level models. GLM-5.2 shows that they no longer even need Nvidia chips to do it.
Chinese labs are basically just telling everyone, out in the open, what they're doing and how to do it. The answer from American frontier labs is "Well, they couldn't possibly be getting the results they're getting without just distilling our models." Meanwhile, the American labs aren't even trying some of the same things, like DeepSeek's aggressive caching, to get costs down.
It happens to all models… when more and more of the internet is AI-generated, things happen.
I disagree with this use of SOTA, and this topic is why.
Anthropic and OpenAI have “cutting-edge” models. These are beyond the state of the art but they are closed, secretive, hard to quantify.
The “state of the art” is open-source, open-weights models that can be inspected, studied, shared and critiqued, because that is what is meant by “the art”: the knowledge, principles, evidence and materials available to all. The “state of the art” is the highest point of that.
I wish we could make this distinction and stop blessing two secretive, unverifiable loss-making companies with so much power.
(Putting that aside, I suspect, without evidence, mind you, that the endless march to solving models by making them bigger is not the solution anyway.)
Chinese models like GLM are getting better at coding tasks, and they're cheaper. Microsoft's GitHub Copilot has had to switch to token-based billing. The cost of AI has increased since agents came into play. Whoever can offer cheaper tokens to do the task will win.
Even Microsoft is looking into DeepSeek for cheap tokens.
https://www.axios.com/2026/06/16/microsoft-copilot-cowork-to...
But "state of the art" implies the highest state of general availability, not just in terms of access to some product, but of use of the ideas, concepts, methodologies etc.
Anthropic and OpenAI have "cutting edge" models; the state of the art is behind the cutting edge.
The state of the art is the best open-source, open-weights model available. More or less by definition.
I am probably tilting at windmills here.
But the way SOTA is generally understood by other users of the language, it refers to exactly the team, technology, and techniques defining the cutting edge in any field, regardless of whether the technology and techniques are available outside of that team...
https://english.stackexchange.com/questions/239963/do-state-...
It's the things you would be trained in as part of a bachelor's degree and some graduate coursework.