Posted by danielhanchen 15 hours ago

Qwen3.5: Towards Native Multimodal Agents (qwen.ai)
345 points | 163 comments
ranguna 9 hours ago|
Already on OpenRouter; prices seem quite nice.

https://openrouter.ai/qwen/qwen3.5-plus-02-15
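
A minimal sketch of calling it through OpenRouter's OpenAI-compatible endpoint, assuming the model slug from the URL above; the key is a placeholder:

    from openai import OpenAI

    client = OpenAI(
        base_url="https://openrouter.ai/api/v1",
        api_key="sk-or-...",  # your OpenRouter key
    )

    resp = client.chat.completions.create(
        model="qwen/qwen3.5-plus-02-15",
        messages=[{"role": "user", "content": "Hello"}],
    )
    print(resp.choices[0].message.content)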

esafak 1 hour ago|
No caching yet.
ggcr 14 hours ago||
From the HuggingFace model card [1] they state:

> "In particular, Qwen3.5-Plus is the hosted version corresponding to Qwen3.5-397B-A17B with more production features, e.g., 1M context length by default, official built-in tools, and adaptive tool use."

Does anyone know more about this? The OSS version seems to have a 262144 context length; I guess for the 1M they'll ask you to use YaRN?

[1] https://huggingface.co/Qwen/Qwen3.5-397B-A17B

NitpickLawyer 14 hours ago||
Yes, it's described in this section: https://huggingface.co/Qwen/Qwen3.5-397B-A17B#processing-ult...

YaRN, but with some caveats: current implementations might reduce performance on short contexts, so only use YaRN for long tasks.
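
For earlier Qwen releases this meant adding a rope_scaling block to the model config; a sketch of the same mechanism via transformers (the factor and lengths here are assumptions, check the model card for the exact values):

    from transformers import AutoModelForCausalLM

    # Hypothetical: scale the assumed 262144 native context ~4x toward 1M.
    # Extra kwargs to from_pretrained override the shipped config.json.
    model = AutoModelForCausalLM.from_pretrained(
        "Qwen/Qwen3.5-397B-A17B",
        rope_scaling={
            "rope_type": "yarn",
            "factor": 4.0,
            "original_max_position_embeddings": 262144,
        },
    )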

Interesting that they're serving both on OpenRouter, and the -plus is a bit cheaper for <256k context, so they must have more (proprietary) inference goodies packed in there.

We'll see where the third-party inference providers settle on cost.

ggcr 13 hours ago||
Thanks, I'd totally missed that.

It's basically the same as with the Qwen2.5 and 3 series, but this time with 1M context and 256k native, yay :)

danielhanchen 14 hours ago||
Unsure, but yes, most likely they use YaRN, and maybe trained a bit more on long context (or not).

Alifatisk 11 hours ago||
Wow, the Qwen team is pushing out content (models + research + blog posts) at an incredible rate! Looks like omni-modal models are their focus? The benchmarks look intriguing, but I can't stop thinking of the HN comments about Qwen being known for benchmaxing.

sasidhar92 7 hours ago||
Going by this pace, I'm more bullish that the capabilities of Opus 4.6 or the latest GPT will be available on a 24 GB Mac.

Someone1234 7 hours ago|
Current Opus 4.6 would be a huge achievement that would keep me satisfied for a very long time. However, I'm not quite as optimistic from what I've seen. The quants that can run on a 24 GB MacBook are pretty "dumb": they're like anti-Thinking models, making very obvious mistakes and confusing themselves.

One big factor for local LLMs is that large context windows will seemingly always require large memory footprints. Without a large context window, you'll never get that Opus 4.6-like feel.
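
A back-of-the-envelope sketch of why: the KV cache grows linearly with context length. The architecture numbers below are placeholders, not Qwen3.5's actual config:

    # KV cache bytes ~= 2 (K and V) * layers * kv_heads * head_dim
    #                   * seq_len * bytes per element
    def kv_cache_gib(layers=64, kv_heads=8, head_dim=128,
                     seq_len=262_144, dtype_bytes=2):
        return 2 * layers * kv_heads * head_dim * seq_len * dtype_bytes / 2**30

    print(f"{kv_cache_gib():.0f} GiB")  # 64 GiB at 256k tokens in fp16

Even with grouped-query attention and a quantized cache, a window that long can swamp a 24 GB machine before the weights are even counted.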

codingbear 5 hours ago||
Do they mention the hardware used for training? Last I heard there was a push to use Chinese silicon; no idea how ready it is for use.

Matl 10 hours ago||
Is it just me, or are the 'open source' models increasingly impractical to run on anything other than massive cloud infra? At that point you may as well go with the frontier models from Google, Anthropic, OpenAI, etc.

doodlesdev 9 hours ago||
You still have the advantage of choosing on which infrastructure to run it. Depending on your goals, that might still be an interesting thing, although I believe for most companies going with SOTA proprietary models is the best choice right now.

regularfry 10 hours ago||
If "local" includes 256GB Macs, we're still local at useful token rates with a non-braindead quant. I'd expect there to be a smaller version along at some point.
benbojangles 5 hours ago||
Was using Ollama, but qwen3.5 was unavailable earlier today.

XCSme 7 hours ago||
I just started creating my own benchmarks: questions that are very simple for humans but tricky for AI, like the "how many r's in strawberry" kind (still WIP).

Qwen3.5 is doing ok on my limited tests: https://aibenchy.com
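
For the counting-style questions, the scoring side can be an exact check; a toy sketch (score_letter_count is a made-up helper, not from the site):

    # Toy scorer for "how many r's in strawberry"-style questions.
    def score_letter_count(word, letter, model_answer):
        expected = word.count(letter)  # ground truth, e.g. 3
        digits = [int(t) for t in model_answer.split() if t.isdigit()]
        return expected in digits

    print(score_letter_count("strawberry", "r", "There are 3 r's."))  # True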

trebligdivad 11 hours ago|
Anyone else getting an automatically downloaded PDF 'ai report' when clicking on this link? It's damn annoying!