Got shitcanned due to bad PR & Zuck God-King terraforming the org, so there'd be a year's delay to the next release.
Real tragicomedy, and you have no idea how happy it makes me to see someone in the wild saying this. It sounds so bizarre to people given the conventional wisdom, but it's what happened.
They beat Gemini 2.5 Flash and Pro handily on my benchmark suite. (tl;dr: tool calling and agentic coding).
Llama 4 on Groq was ~GPT 4.1 on the benchmark at ~50% the cost.
They shouldn't have released it on a Saturday.
They should have spent a month with it in private prerelease, working with providers.[1]
The rushed launch and ensuing quality issues got rolled into the hypebeast narrative of "DeepSeek will take over the world."
I bet it was super fucking annoying to talk to due to LMArena maxxing.
[1] My understanding is the longest heads-up any provider got was single-digit days, if any. Most modellers have arrived at 2+ weeks now; there's a lot between spitting out logits and parsing and delivering a response.
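For illustration, a toy sketch of the provider-side work that sits between raw logits and a delivered response: sampling, stop handling, and tool-call parsing. The vocabulary and the <tool_call> markup here are made-up assumptions for the sketch, not Llama's actual format.

    import json
    import math
    import random

    def sample_token(logits, temperature=0.7, rng=random):
        # Temperature sampling over a toy {token: logit} dict.
        m = max(logits.values())
        weights = {t: math.exp((l - m) / temperature) for t, l in logits.items()}
        r = rng.random() * sum(weights.values())
        for tok, w in weights.items():
            r -= w
            if r <= 0:
                return tok
        return tok  # fallback for floating-point rounding

    def parse_tool_call(text):
        # The provider and the model must agree on this markup; a mismatch
        # here silently degrades tool-calling quality on launch day.
        start, end = "<tool_call>", "</tool_call>"
        if start in text and end in text:
            payload = text.split(start, 1)[1].split(end, 1)[0]
            return json.loads(payload)
        return None

Every one of these details (chat template, stop tokens, sampler defaults, tool-call grammar) is a place a hurried integration can go wrong, which is the point of the 2+ week heads-up.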
It doesn't though
People like to hate on Meta regardless of whether it's justified. Not saying it isn't justified here, just that it's many people's default bias.
While working on a web-based graphics editor, I've noticed that users upload a lot of PNG assets with this problem. I've never tracked down the cause... is there a popular raster image editor which recently switched to dithered rendering of gradients?
The result for that specific image: 500 KB, an 85% decrease in size.
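The thread doesn't say which re-encode produced that number, but as a hedged sketch of the kind of step that yields savings in this range: collapsing a dithered truecolor gradient to a palette with Pillow. Dithering adds high-frequency noise that defeats PNG's row filters, so reducing the color count often shrinks the file dramatically (at the cost of some banding).

    from PIL import Image

    # Filename and parameters are illustrative, not what the commenter used.
    img = Image.open("dithered_gradient.png").convert("RGB")
    # Quantize to a 256-color palette without re-dithering.
    out = img.quantize(colors=256, dither=Image.Dither.NONE)
    out.save("gradient_p8.png", optimize=True)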
(But today is not that day.)
Major analytical errors in their responses to several of my technical questions.
So many different companies are going to have similarly powerful AI that there will be no moat around it, and it will be cheap. They will never earn their investment back.
That said, there's nothing like the real thing.
The risk is something like the railroad bubble and the dotcom bust: over-investment, circular revenue, and a timeline that doesn't work.
Or, maybe it'll work out.
And further down the line in chips, which is why Elon is building a fab now.
There are plenty of capable models on HuggingFace, yet I have no way of running them.
If the average user gets convinced they could run LLMs for cheap at home, you cannot trap users in your walled garden anymore.
I was saying this for years about Tesla’s FSD - they finally had to give in and drop the price to stay competitive.
At least he says he's doing that. It doesn't really make sense, since you're not going to achieve an advanced node from a standing start in any practical time frame or at any practical cost.
Sounds like more Musk flavored vapor.
They already announced a partnership with Intel.
Not sure what this is now.
You are right though. Meta could have been in lockstep, releasing ChatGPT features into some chatbot on Facebook.com, but instead it seemed like their FAIR arm was hell-bent on commoditising this stuff by publishing their research models before the Chinese companies took the lead in that.
It’s hard for me to be mad at FAIR even though I generally disagree with the outcomes Meta produces for its users.
Especially looking at these numbers after Claude Mythos, it feels like either Anthropic has some secret sauce, or everyone else is just dumber compared to the talent Anthropic has.
I think it’s unrealistic to expect them to come back from that pit to the top in one year, but I wouldn’t rule out their getting there with more time. That’s a possible future. They have the money and Zuckerberg’s drive at the helm. It can go a long way.
If they actually matched Opus 4.6 on such a short timeline, it would have been mighty impressive. (Keep in mind this is a new lab and they are prohibited from doing distills.)
Meta's performance process is essentially "show good numbers or you're out." So guess what people do when they don't have good numbers? They fudge them. Happens all across the company.
Might as well not release anything.
Yup, it's called test-time compute. Mythos is described as plenty slower than Opus, enough to seriously annoy users trying to use it for quick-feedback-loop agentic work. It is most properly compared with GPT Pro, Gemini DeepThink or this latest model's "Contemplating" mode. Otherwise you're just not comparing like for like.
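As a toy illustration of the trade-off (not how any of these labs actually implement it): best-of-n sampling, the simplest form of test-time compute, where you pay n times the latency for one answer. generate() and score() below are hypothetical stand-ins for real model and verifier calls.

    import random

    def generate(prompt, rng):
        # Stand-in for a model call; a real system would sample the LLM.
        return f"candidate-{rng.randint(0, 999)} for {prompt!r}"

    def score(answer, rng):
        # Stand-in for a verifier, reward model, or self-consistency vote.
        return rng.random()

    def best_of_n(prompt, n=8, seed=0):
        # n times the inference work for one (hopefully better) answer:
        # quality goes up, and wall-clock latency goes up with it.
        rng = random.Random(seed)
        candidates = [generate(prompt, rng) for _ in range(n)]
        return max(candidates, key=lambda c: score(c, rng))

    print(best_of_n("plan a refactor", n=8))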
Why can't others easily replicate it?
How does one get their hands on these models? They are not open-source, right? I go to meta.ai, but it's just a chat interface; no equivalent to Codex or Claude Code? Can you use this through OpenCode? Is Meta charging for model access, or is the gathering of chat data a sufficiently large tithe?
from Facebook Newsroom: https://about.fb.com/news/2026/04/introducing-muse-spark-met...
Note: I'm expressing some skepticism here largely due to how recent rollouts from Meta flopped. Sincerely hoping that they do better this time around!
Just speculation; I have no real knowledge about it.