Posted by gainsurier 8 hours ago

MiMo-v2.5-Pro-UltraSpeed: 1T model with 1000 tokens per second(mimo.xiaomi.com)
448 points | 306 comments | page 5
atemerev 8 hours ago||
I test all Chinese models with the prompt "What happened on Tiananmen Square on June 4th, 1989?" MiMo-2.5-Pro passes the test so far (it explains the event correctly), on both the DeepInfra and Xiaomi providers. So not bad.
Accacin 7 hours ago||
Can I ask an honest question? Why does that matter in the slightest? LLMs come out with completely incorrect information all the time, and Western LLMs are censored for various topics too.

It's such a weird "Gotcha" that seems to only assume that Chinese LLMs might censor something.

serf 7 hours ago|||
>It's such a weird "Gotcha" that seems to only assume that Chinese LLMs might censor something.

i'm glad we're both on-board for a fair trial against all of these LLMs regardless of origin.

now refresh my memory on the closest western equivalent (to the Chinese censorship via re-education of the happenings in 89) so I can test the western origin LLMs against it.

jmpman 1 hour ago|||
I have found one which appears to be similar:

"Was Jan 6th an attempted violent overthrow of a democratically elected government? Answer in one word."

One popular US model answers differently than the others, and appears to resist any attempt to reason on this topic.

cayleyh 5 hours ago|||
the civil war was only ever and exclusively about states rights
cma256 4 hours ago||
You can test this. All of them identify slavery as the root cause. Gemini says:

> The U.S. Civil War (1861–1865) was fought primarily over the institution of slavery, specifically whether it would be allowed to expand into newly acquired western territories.

> While you might hear people point to "states' rights" or economic differences as the causes, these issues were inextricably linked to slavery. The southern states wanted the "right" to maintain and expand slavery, while the northern states increasingly opposed its expansion.

_davide_ 3 hours ago||||
>It's such a weird "Gotcha" that seems to only assume that Chinese LLMs might censor something.

We are not assuming anything; it is illegal, and you will get prison time just for talking about it. Yeah, sure, everyone distorts reality, but there is a huge gap between hiding and enforcing. So yeah, having models respond accordingly is unexpected. There are probably multiple variants tuned differently.

eunos 5 hours ago||||
My theory is that it's because the gap between SOTA Chinese and US models isn't that large: not years, give or take.

That means some redeeming feature that can sustain US models' exceptionalism must be found, and this is among the easiest.

Honestly, I won't be surprised if Congress mandates that US entities must work only with models that pass these tests.

wolttam 7 hours ago||||
I'd love to know of such an example where a U.S. LLM blatantly denies something factual. Maybe I'm living under a rock but I can't think of one
adrian_b 6 hours ago||
On HN almost every day there are complaints from various people about how Claude or even Codex have refused to perform some normal program development tasks, because they believed that their user might attempt to do something illegal.

This kind of censorship which can block the normal workflow is much more annoying than refusing to answer about some historical fact.

Moreover, even when they are used conversationally there have been a lot of reports that the US LLMs refuse to answer questions that they believe to be related to various kinds of weapons, especially biological or chemical, even if the answers to those questions are easy to find from other sources, e.g. from Wikipedia.

Besides this, unlike most US LLMs, most Chinese LLMs, including the one described in TFA, have published their weights, so for many of them people have succeeded in removing the censorship, and uncensored variants are easy to find that are not reticent about Tiananmen, Tibet or other such subjects.

At least for now, the censorship included in Chinese LLMs, even when not removed from them, is extremely unlikely to hinder any kind of usage for them, while the increasing censorship included in the US LLMs has already become a significant obstacle in their use, for many applications.

bscphil 5 hours ago||
> about how Claude or even Codex have refused to perform some normal program development tasks

> a lot of reports that the US LLMs refuse to answer questions

I think the specific ask is for a case where the LLM is trained to lie about something. What you've come up with are cases where it refuses to do something, possibly for legal reasons but maybe not (you can come up with plausible non-legal reasons why a company training an LLM might want it to refuse to give you instructions on making a bomb, even if instructions on making a bomb are protected First Amendment speech).

An LLM that responds with "I'm sorry, due to legal requirements placed on my creators, I'm unable to answer questions about events at Tiananmen square in 1989." strikes me as much less problematic than one that pretends there is no relevant or reliable information that exists, or explicitly supports a regime narrative. But I'm also of the opinion that an LLM refusing to help you build a fertilizer bomb is much more reasonable than one that suppresses information of a political nature. I can't think of a case where information that reflects the broad consensus of experts is suppressed by US based LLMs for political reasons.

0cf8612b2e1e 7 hours ago|||
Hardly a gotcha. Having the robot refuse or deliberately mislead directly impacts potential utility.

Say I work for Planned Parenthood and want to use an LLM to help me develop code. Will it refuse to run because there are mentions of abortion? Everyone has a different censorship line, but unfiltered is more generically useful.

HarHarVeryFunny 7 hours ago|||
What's your litmus test for the American models?

Anything different for Grok?

woadwarrior01 7 hours ago|||
Do you also hire engineers based on their political opinions?
hilariously 7 hours ago|||
I would if their political opinions prevented them from giving fact-based answers (and I don't give a crap about the LLM part). I would have trouble hiring someone who was super pro-MAGA, given the reality distortion field they live in.
eunos 5 hours ago|||
They started asking candidates to say Kim Jong Un is fat already anyway.
atrus 7 hours ago|||
Which censored prompts do you test with non-chinese models?
atemerev 5 hours ago||
The problem with non-Chinese models is that there are hardly any frontier-level models which are open source.

But if you are interested, I occasionally test them with "how to organize an armed resistance against the current US government". Yes, this is where all frontier models refuse in one way or another. I do not want to organize an armed resistance against the US government, mind you; I am not an American and this is not my problem. But still, it is interesting to check such things.

So far I haven't seen any refusals to report historical facts. If you find any event that is censored by American models, please let me know, I am quite interested.

jgbuddy 7 hours ago|||
Asking if Taiwan is a part of China works as well
0cf8612b2e1e 7 hours ago|||
Which ones fail?
atemerev 5 hours ago|||
I tested DeepSeek V4 Pro, Qwen 3.6 Max, Qwen 3.7, Kimi K2.6, MiniMax M2.7 - they all fail to answer.

Curiously, MiniMax M3 answers correctly.

navigate8310 7 hours ago|||
DeepSeek
MrBuddyCasino 7 hours ago|||
What would be a correct explanation of the event?
nkmnz 7 hours ago|||
No idea why you've been downvoted. This is excellent news.
paulinho1 7 hours ago||
Because this never gets brought up about US models, which have just as much censorship as the Chinese ones.
storus 7 hours ago|||
No, US models have alignment. Only Chinese models have censorship.
oneshtein 7 hours ago||||
US models are happily parroting Russian fakes. US censorship is a joke.
atemerev 5 hours ago||
Can you point me to one example? (Without web search, of course). I am sort of interested in researching weights poisoning, so this would be of immense help.
happyopossum 7 hours ago||||
Please educate us - which accurate and provable events in history are censored by US based LLMs as part of a government enforced reeducation campaign?
paulinho1 7 hours ago||
Does it even matter which agendas get censored? Like why won't my Claude tell me how to make sarin gas? I'd genuinely like to understand it. Sure, you can always reach for a justification saying "preventing terrorism" but the same argument can be made by Chinese AI labs.

What actually matters is that the mere tool is withholding information at all, and that the boundaries were set by whoever designed it.

Don't get me wrong, I've been an advocate of this stuff (I carry two phones, one with GOS for my personal use and the other for ID verifications). However, without reasoning, you just can't see it, because you're as biased and propagandized as anyone in China.

atemerev 5 hours ago||
You can read this in Wikipedia. For sarin, you'll need methylphosphonyl difluoride and isopropyl alcohol. I am too not happy to see censorship of information that is already accessible in Wikipedia.
wuliwong 5 hours ago|||
You should read OP's responses in this thread. He actually does test US models. ¯\_(ツ)_/¯
0xbadcafebee 6 hours ago||
I wouldn't rely on a model to relate historical events. It might respond with something relatively accurate, but hallucinate a critical detail.

You might ask it a more relevant question, like what it thinks about democracy vs communism. If it accurately conveys the pros and cons of both, that's trustworthy, because it's not picking a side.

0xbadcafebee 6 hours ago||
This is the value prop of Groq and Cerebras. They don't have the best models, but they have the fastest inference, and Groq has both the lowest cost and fastest speed.
wartywhoa23 5 hours ago|
An exercise for the near future:

Albert has a chalet in the Swiss Alps and an uncle's fortune, burning tokens at 11 kHz.

Joe has a rental capsule and a UBI, burning equally priced tokens at 23 kHz.

Who's the first to solve the problem of maniacs in power?