Is there any official source that could confirm whether Fable (or Mythos) uses parallelized test-time compute (like GPT 5.5 Pro) or a sparse Mixture-of-Experts (MoE) transformer combined with a multi-agent, inference-time compute scaling architecture (like Gemini 3.1 Deep Think)?
https://deepclause.substack.com/p/how-to-make-small-models-p...
There's also the concept of "smart routing" requests based on some heuristics / embeddings. You'd get "simple" tasks handled by smaller (cheaper) models and use a bigger model to curate / sort / merge the results.
There are a lot of things to try here. I wouldn't personally pay for this service, but I don't think it's "a joke"...
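To make the "smart routing" idea concrete, here is a minimal sketch of a heuristic router. Everything in it is an assumption for illustration: the model names `small-model` and `big-model`, the keyword list, and the word-count threshold are hypothetical, and a real router would more likely use embeddings or a trained classifier than keyword matching.

```python
from dataclasses import dataclass
from typing import Callable

# Hypothetical keywords that hint at a "hard" task (assumption, not from any real router).
HARD_HINTS = ("prove", "derive", "security", "refactor", "debug")


@dataclass(frozen=True)
class Route:
    model: str   # hypothetical model name: "small-model" or "big-model"
    reason: str  # why the router picked it


def route_request(prompt: str, max_simple_words: int = 40) -> Route:
    """Pick a model tier from crude heuristics: keywords first, then length."""
    text = prompt.lower()
    # Plain substring match, so "prove" also matches "improve"; good enough for a sketch.
    if any(hint in text for hint in HARD_HINTS):
        return Route("big-model", "matched a hard-task keyword")
    if len(text.split()) > max_simple_words:
        return Route("big-model", "prompt is long")
    return Route("small-model", "short prompt with no hard-task keyword")


def answer(prompt: str, small: Callable[[str], str], big: Callable[[str], str]) -> str:
    """Send the prompt to whichever callable the router selects."""
    route = route_request(prompt)
    return (small if route.model == "small-model" else big)(prompt)
```

Here `small` and `big` are stand-ins for whatever client calls you already have; the sketch only shows the dispatch decision, not the "bigger model merges the results" step.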
https://news.ycombinator.com/item?id=44630724
They randomly alternated between frontier LLMs and got a massive boost to performance on cybersecurity tasks.
If cost becomes an even bigger problem, being able to choose between "best performance possible" and "strong but cost effective" will be useful.
OAI/ANT can subsidize their own subscriptions, so it’s hard to compete there. But the results I got from fugu-ultra were impressive.
Personally, I prefer understanding the dimensions and their interplay and controlling them myself, though I can see why OpenRouter and others are now offering this as a ready-made solution.
Just be careful when you start outsourcing too much of your intelligence needs to a black box.
Do you not worry about giving away your most intimate data to for-profit companies that have not signed anything committing them to protect your data in a dignified fashion?
This asks a special orchestrator they built, which sits in front of a bunch of models, which model would suit the request best.
Regular Fugu seems to be just "pick the best model and route the request there"
Fugu Ultra can instead generate a little mini workflow/plan to achieve a result, for example:
1. Ask GPT to derive the math.
2. Ask Opus to check for implementation/security issues.
3. Ask Gemini to synthesize or resolve disagreement.
4. Return final answer.
I could be wrong, but that's what it seems to be at a glance, so I think it's more dynamic than OpenRouter Fusion.
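The mini workflow described above could be sketched roughly like this. This is a guess at the general shape, not how Fugu Ultra actually works: the `Step` type, the `run_plan` helper, and the idea of passing each step's output forward as accumulated context are all assumptions, and the model callables are placeholders for real API clients.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

ModelFn = Callable[[str], str]  # placeholder for a real model client call


@dataclass
class Step:
    model: str        # key into the models dict (hypothetical names)
    instruction: str  # what this step asks the model to do


def run_plan(task: str, plan: List[Step], models: Dict[str, ModelFn]) -> str:
    """Run steps in order; each step sees the task plus all earlier outputs."""
    if not plan:
        raise ValueError("plan must contain at least one step")
    context = task
    output = ""
    for step in plan:
        prompt = f"{step.instruction}\n\n{context}"
        output = models[step.model](prompt)
        # Append the tagged output so later steps can check or reconcile it.
        context = f"{context}\n\n[{step.model}] {output}"
    return output  # the last step's output is the final answer


# Hypothetical plan mirroring the four steps in the comment.
EXAMPLE_PLAN = [
    Step("gpt", "Derive the math."),
    Step("opus", "Check for implementation/security issues."),
    Step("gemini", "Synthesize or resolve disagreement."),
]
```

In a dynamic orchestrator the plan itself would presumably be generated per request rather than hard-coded like `EXAMPLE_PLAN`.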
https://www.databricks.com/blog/introducing-omnigent-meta-ha...
> So basically... openrouter
:skull:
I now really wonder how many members of the public understood my thesis defense lol
After a few months of spending money on the best frontier models, now I am spending time using DeepSeek v4 flash as my workhorse, and flipping to more capable (but still very inexpensive) open models on an as-needed basis. We all make our own tool selection decisions, but for me, I feel happier and enjoy working more following the very fast response and ultra low cost path.
At least, for the initial data gathering phase. You'd probably want a sequence of progressively larger models to filter it.
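A sequence of progressively larger models used as filters might look like the sketch below. The `cascade_filter` name and the `Judge` callables are hypothetical; each judge stands in for a model call that returns keep/drop, ordered from cheapest to most expensive so the larger models only see what the smaller ones let through.

```python
from typing import Callable, Iterable, List, Sequence

Judge = Callable[[str], bool]  # placeholder for "ask model N whether to keep this item"


def cascade_filter(items: Iterable[str], judges: Sequence[Judge]) -> List[str]:
    """Apply judges in order (cheapest first); each stage only sees prior survivors."""
    survivors = list(items)
    for judge in judges:
        survivors = [item for item in survivors if judge(item)]
        if not survivors:
            break  # nothing left, so skip the more expensive stages
    return survivors
```

The cost argument is that the expensive judge runs on a shrinking set, at the risk that a cheap early stage wrongly drops something a larger model would have kept.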
Have you guys tested it on anything other than research?