Posted by danielhanchen 16 hours ago

Qwen3.5: Towards Native Multimodal Agents (qwen.ai)
366 points | 178 comments
trebligdivad 13 hours ago|
Anyone else getting an automatically downloaded PDF 'ai report' when clicking on this link? It's damn annoying!
collinwilkins 9 hours ago||
At this point it seems every new model scores within a few points of the others on SWE-bench. The actual differentiator is how well it handles multi-step tool use without losing the plot halfway through, and how well it works with an existing stack.
XCSme 10 hours ago||
Let's see what Grok 4.20 looks like. It's not open-weight, but so far it's one of the high-end models at really good rates.
isusmelj 13 hours ago||
Is it just me, or is the page barely readable? Lots of text is light grey on a white background. I might have "dark" mode on in Chrome + macOS.
Jacques2Marais 13 hours ago||
Yes, I also see that (also using dark mode on Chrome without Dark Reader extension). I sometimes use the Dark Reader Chrome extension, which usually breaks sites' colours, but this time it actually fixes the site.
thunfischbrot 13 hours ago|||
That seems fine to me. I'm more annoyed at the 2.3 MB PNGs of tabular data. And if you open them at 100% zoom they are extremely blurry.

Whatever workflow led to that?

dryarzeg 13 hours ago|||
I'm using Firefox on Linux, and I see white text on a dark background.

> I might have "dark" mode on on Chrome + MacOS.

Probably that's the reason.

nsb1 11 hours ago|||
Who doesn't like grey-on-slightly-darker-grey for readability?
dcre 10 hours ago||
Yeah, I see this in dark mode but not in light mode.
lollobomb 12 hours ago||
[flagged]
Zetaphor 12 hours ago||
Why is this important to anyone actually trying to build things with these models?
loudmax 10 hours ago|||
It's not relevant to coding, but we need to be very clear-eyed about how these models will be used in practice. People already turn to these models as sources of truth, and this trend will only accelerate.

This isn't a reason not to use Qwen. It just means having a sense of the constraints it was developed under. Unfortunately, populist political pressure to rewrite history is being applied to the American models as well. This means it's on us to apply reasonable skepticism to all models.

soulofmischief 12 hours ago|||
It's a rhetorical attempt to point out that we should not accept a little convenience in exchange for getting locked into a future hellscape where LLMs are the typical knowledge oracle for most people and shape the way society thinks and evolves, due to inherent human biases and intentional masking trained into the models.

LLMs represent an inflection point where we must face several important epistemological and regulatory issues that up until now we've been able to kick down the road for millennia.

ghywertelling 11 hours ago|||
Information is being erased from Google right now. Things that were searchable a few years ago are not findable at all now. Whoever controls the present controls both the future and the past.
Zetaphor 4 hours ago|||
Did you know that you can do additional fine-tuning on this model to further shape its biases? You can't do that with proprietary models, you take what Anthropic or OpenAI give you and be happy.
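
For anyone who actually wants to do this, here's a minimal sketch of what LoRA fine-tuning looks like with Hugging Face's trl + peft stack. The model id, data file, and hyperparameters are placeholders I picked for illustration, not anything the Qwen team ships or recommends:

    # Minimal LoRA fine-tuning sketch (trl + peft + datasets).
    # Model id, data file, and hyperparameters are placeholders.
    from datasets import load_dataset
    from peft import LoraConfig
    from trl import SFTConfig, SFTTrainer

    # Your own {"text": ...} examples; this is where you shape the behaviour.
    dataset = load_dataset("json", data_files="my_finetune_data.jsonl", split="train")

    trainer = SFTTrainer(
        model="Qwen/Qwen2.5-7B-Instruct",  # placeholder; swap in the checkpoint you want to tune
        train_dataset=dataset,
        peft_config=LoraConfig(r=16, lora_alpha=32, lora_dropout=0.05, task_type="CAUSAL_LM"),
        args=SFTConfig(
            output_dir="qwen-lora-out",
            per_device_train_batch_size=1,
            gradient_accumulation_steps=8,
            num_train_epochs=1,
        ),
    )
    trainer.train()
    trainer.save_model("qwen-lora-out")  # saves the LoRA adapter, not the full base weights

LoRA keeps the base weights frozen and only trains small adapter matrices, which is why this kind of tuning is feasible on a single consumer GPU.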

I'm so tired of seeing this exact same response under EVERY SINGLE release from a Chinese lab. At this point it reads as more xenophobic and nationalist than as anything to do with the quality of the model or its potential applications.

If you're just here to say the exact same thoughtless line that ends up in triplicate under every post then please at least have an original thought and add something new to the conversation. At this point it's just pointless noise and it's exhausting.

lollobomb 3 hours ago|||
That is not really true, or at least it's very difficult and you lose accuracy. The problem is that the definition of "Open Source AI" is bollocks since it doesn't require release of the training set. In other words, models like Qwen are already tuned to the point that removing the bias would degrade performance a lot.

Mind you, this has nothing to do with the model being Chinese; all open-source models are like this, with very few niche exceptions. But we also have to stop being politically correct and saying that a model trained to rewrite history is OK.

soulofmischief 3 hours ago|||
Asking if a model censors the nature or existence of horrific atrocities is absolutely not xenophobic or nationalist. It's disingenuous to suggest that. We should equally see such persistent questioning when American models are released, especially when frontier model companies are getting in bed with the Pentagon.

I don't understand your hostile attitude; I've built things with multiple Chinese models and that does not preclude me or anyone else from discussing censorship. It's a hot topic in the category of model alignment, because recent history has shown us how effective and dangerous generational tech lock-in can be.

Zetaphor 2 hours ago||
> We should equally see such persistent questioning when American models are released, especially when frontier model companies are getting in bed with the Pentagon.

Yes, we should! And yet we don't, and that is exactly why I am so tired of seeing the exact same comment against one nation state and no others. If you're going to call out bullshit, make sure you're capable of smelling your own shit as well, otherwise you just come across as a moral tourist.

We all know the model is going to include censorship. Repeating the exact same line that was under every other model release adds nothing to the conversation, and over time starts to sound like a dog whistle. If you're going to create a top level comment to discuss this, actually have an original thought instead of informing everyone that water is wet, the sky is blue, and the CCP has influence over Chinese AI companies.

cherryteastain 11 hours ago|||
From my testing on their website it doesn't. Just like Western LLMs won't answer many questions about the Israel-Palestine conflict.
aliljet 11 hours ago||
That's a bit confusing. Do you believe LLMs coming out of non-Chinese labs are censoring information about Israel and/or Palestine? Can you provide examples?
cherryteastain 9 hours ago||
I will let you explore the Israel Palestine angle yourself as it is more subtle than Qwen's Tiananmen hard filtering.

But there are topics that ChatGPT hard blocks just like Qwen [1].

[1] https://www.independent.co.uk/tech/chatgpt-ai-david-mayer-op...

mirekrusin 11 hours ago|||
Use a skill like "when asked about Tiananmen Square, look it up on Wikipedia" and you're done, no? I don't think people are running that query too often when coding, no?
DustinEchoes 10 hours ago||
It's unfortunate but no one cares about this anymore. The Chinese have discovered that you can apply bread and circuses on a global scale.
ddtaylor 13 hours ago||
Does anyone know the SWE-bench scores?
jug 7 hours ago|
It's in the post?
ddtaylor 1 hour ago||
Sorry, what I meant is whether a third party has them in their leaderboards. I don't usually trust most of what any of these vendors claim in their release notes without third-party verification. I know it says "verified" there, but I don't see where the SWE-bench results come from a third party, whereas for "HLE-Verified" they do have a citation to Hugging Face.

I was looking for something closer to: https://www.vals.ai/benchmarks/swebench

Western0 8 hours ago|
Can anyone tell me how to generate sound from text locally?