Whistleblower: Huawei cloned Qwen and DeepSeek models, claimed as own

Posted by dworks 7/6/2025

Whistleblower: Huawei cloned Qwen and DeepSeek models, claimed as own (dilemmaworks.substack.com)

119 points | 58 comments

maxglute 7/6/2025|

Writer somewhat naive. His Ascend team couldn't get comparable performance (gen1 910A NPUs) initially vs (I assume) Nvidia because obviously. Management supported teams that pivot to cloned alternatives that used GPUs that can be immediately commercialized. Internal office politics make this happen. Ascend team works out kinks (this is huge confirmation), but feel (are) mistreated, i.e. biased bureaucracy, lack of recognition. Many burnout / leave to other Chinese AI companies.

HW strategy/culture has been burning tier1 talent since forever. I remember in the 90s when HW and other domestic PRC telcos started poaching from Nortel, Siemens, Lucent etc... the talent (mostly Chinese diaspora used to comfy western office culture) did not have a good time fitting into an actual Chinese company with Chinese culture (but got paid lots). Many burned out too... yet HW, a particularly extreme outlier of militant work culture, has become dominant.

LBH, HW post-sanctions is a strategic company, overlapping with semi fabrication and domestic chips, and AI is cubing their strategic value. They can get away with doing anything under the current geopolitical environment to stay dominant. The worthwhile takeaway from this farewell letter is HW threw enough talent at Ascend that it kind of works now, and potentially can throw enough talent at it to be competitive with Nvidia. AKA how it has always operated, like massive wankers. The intuition from the author and most of us is... you need to reward employees right, cultivate proper workplace environment blah blah blah... but look at HW for the past 30 years. They pay a lot of smart people (including patriotic suckers) A LOT of money, throw them at problems until they break. And win.

rjzzleep 7/7/2025|

This doesn't seem right at all, given that DeepSeek reported massive performance increase because the Huawei team helped them port their LLM to HW infra. I was willing to put it in the "I don't know, maybe, let's see" category, but that comment specifically makes it read like a propaganda piece.

maxglute 7/7/2025||

Ascend 910A was Q1 2022. Qwen 2.5 / DeepSeek v3 was Q1 2025. Implied timeline seems to be that it took ~2.5 years of development to figure out how to use Ascend somewhat competitively, and HW may rival Nvidia with proper support. Where proper support, in the author's opinion, is building a team and treating the team well. My guess is Huawei can keep burning talent at the problem.

The Huawei+DeepSeek > Nvidia for inference claim is based on the HW CloudMatrix 384 supernode using all of the optical interconnects and power to allegedly outperform an Nvidia cluster. Bottleneck of the Nvidia cluster is switching. Bottleneck of the PRC cluster is chips on an old node size (slower or more power hungry). CM384 workaround is 5x more 910C chips and 4x more power consumption, and connecting the cluster with full optical to compete vs GB200 on pure cluster performance. Hard to say what the actual economics of that solution are.

bigmattystyles 7/6/2025||

Old maps (and perhaps new ones) used to add fake little alleys so a publisher could quickly spot publishers infringing on their IP rather than going out and actually mapping. I wonder if something similar is possible with LLMs.

tedivm 7/6/2025||

When I was at Malwarebytes we had concerns that IOBit was stealing our database and passing it off as their own. While we had a lot of obvious proof, we felt it wasn't enough for the average person to understand.

To get real proof we created a new program that only existed on a single machine, and then added a signature for that application. This way there could be no claim that they independently added something to their database, as the program was not malware and literally impossible to actually find in the wild. Once they added it to their database we made a blog post and the issue got a lot of attention.

https://forums.malwarebytes.com/topic/29681-iobit-steals-mal...

e9 7/6/2025|||

I was learning OS stuff and made a toy virus for myself back in 1999, and I thought it would be cool if antivirus officially recognized it, so I sent a copy to an antivirus company (Dr.Web, I think it was called?) and to my surprise now all antivirus databases have it and someone even has a GIF recording of a machine booting up with it… so clearly they must be sharing not just the db but also the executables etc

tedivm 7/6/2025||

There are sharing programs between companies, yes, but that isn't what we're talking about here.

belter 7/6/2025|||

> When I was at Malwarebytes

I hope you were not the one that decided that, to uninstall the product, you need to download a support utility... :-)

landl0rd 7/6/2025|||

The classic example here is subtle, harmless defects/anomalies built into computer chips. Half the stuff China's made is full of these because they're straight ripped from reverse engineering of TI or whomever's stuff.

Very funny that the Chinese even do this to each other; equal-opportunity cheats.

throwaway74354 7/6/2025||

It's an important part of the culture and is not considered cheating. IP protection laws and legal precedents are not universal truths.

This article on the topic is a good explainer, https://aeon.co/essays/why-in-china-and-japan-a-copy-is-just... , but it's a thoroughly studied phenomenon.

cadamsdotcom 7/7/2025||

Thanks for this read, it really opened my eyes to some things I thought were universal - what copying actually is.

More interestingly that article dives into the reasons why keeping “old stuff” around (instead of renewing it) is only a winning strategy while your society is “only” a few centuries old. The West will one day be old enough that it decides to renew its old stuff too, just like the eternally 20-year-old Japanese temple.

varispeed 7/6/2025|||

I often say an odd thing on a public forum or make up a story and then see if an LLM can bring it up.

I started doing that once an LLM provided me with a solution to a problem that was quite elegant, but was not implemented in the particular project. Turns out it learned it from a GitHub issue post that described how the particular problem could be tackled, but the PR never actually got in.

richardw 7/6/2025||

I’ve wondered whether humans who wanted to protect some areas of knowledge just start writing BS here and there. Organised and large scale, with hidden orchestration channels, it could potentially really screw with models. Put the signal to humans in related but slightly removed places.

Tokumei-no-hito 7/6/2025|||

i have come across this one for example https://github.com/sentient-agi/OML-1.0-Fingerprinting

> Welcome to OML 1.0: Fingerprinting. This repository houses the tooling for generating and embedding secret fingerprints into LLMs through fine-tuning to enable identification of LLM ownership and protection against unauthorized use.

NitpickLawyer 7/6/2025||

Would be interesting to see if this kind of watermarking survives the frankenstein types of editing they are presumably doing. Per the linked account, they took a model, changed tokenizers, and added layers on top. They then presumably did some form of continued pre-training, and then post-training. It would have to be some very resistant watermarking to survive that. It's not as simple as making the model reply with "my tokens are my passport, verify me" when you ask them the weather in NonExistingCity... Interesting nonetheless.

Tokumei-no-hito 7/6/2025||

i have never used it and have limited understanding of fine-tuning models. i only remember seeing this a few weeks ago and your comment reminded me. i am curious too.

ateng 7/6/2025|||

Youtuber Jay Foreman made a video about fake alleys in maps https://www.youtube.com/watch?v=DeiATy-FfjI

yorwba 7/6/2025||

The original whistleblower article in Chinese at the bottom (but not the English version at the top) has this part:

实际上，对于后续训了很久很久的这个模型，Honestagi能够分析出这个量级的相似性我已经很诧异了，因为这个模型为了续训洗参数，所付出的算力甚至早就足够从头训一个同档位的模型了。听同事说他们为了洗掉千问的水印，采取了不少办法，甚至包括故意训了脏数据。这也为学术界研究模型血缘提供了一个前所未有的特殊模范吧。以后新的血缘方法提出可以拿出来溜溜。

In fact, I'm surprised that HonestAGI's analysis could show this level of similarity for this model that had been post-trained for a long time, because the computing power used to train-wash the parameters of this model was enough to train a model of the same size from scratch. I heard from my colleagues that they took many measures to wash off Qwen's watermark, even deliberately training on dirty data. This also provides an unprecedented case study for the academic community studying model lineage. If a new lineage method is put forward in the future, you can take it for a spin.

egypturnash 7/6/2025||

LLMs are apparently completely incompatible with copyright anyway, so if you can train them without paying a single dime to anyone whose work you ingest, then you should be able to clone them for free. What goes around comes around.

mensetmanusman 7/6/2025|

They are naïvely incompatible, but lawyers will find a way to make it not so.

jauntywundrkind 7/6/2025||

Meanwhile Apple legitimately built on Qwen2.5-Coder-7B, adding some of their own novel ideas. It mostly seems like custom training for their own code examples, but notably if you turn the temperature up, it can write multiple blocks of code out of order.

https://9to5mac.com/2025/07/04/apple-just-released-a-weirdly... https://news.ycombinator.com/item?id=44472062

option 7/6/2025||

"Organization: We belong to the “Fourth Field Army” initiative. Under its structure, core language large models fall under the 4th brigade; Wang Yunhe’s small-model group is the 16th brigade."

- Lol, what? So is this literally a part of CCP military?

tedivm 7/6/2025||

I don't think so. The Fourth Field Army doesn't exist anymore (and hasn't since 1955). My guess is the company named their LLM initiative after this for historic reasons, and that these are more like internal project code names than anything else.

neurostimulant 7/7/2025||

Huawei has a military-style culture. Even their new employee orientation is run like an army boot camp.

https://archive.is/wvbca

throwaway48476 7/6/2025||

Chinese efficiency. The west is held back by archaic IP laws.

JPLeRouzic 7/6/2025||

That's a very human and very honest report. It presents the confusion there is in some big companies and how the pressure by the management favors dishonest teams. The writer left the company. I hope he is well; he is a fine person.

dworks 7/6/2025|

Yes. In fact, this report should be written in the context of other farewell letters to employers that have been published recently in China. There has recently been one, by a 15-year Alibaba veteran, who decried the decline of the company culture as a cause of its now lacking competitiveness and inability to launch new products.

The issues in this report are really about: 1. Lies about Huawei's capabilities to the country (important national issue) 2. Lies to customers who paid to use Huawei models 3. A rigid, KPI-focused and unthinking organization where dishonest gaming of the performance review system not only works but seems to be the whole point and is tacitly approved (this, and the reporter's idealism and loss of faith, is the main point of the report as I see it)

yorwba 7/6/2025||

I think the reporter's motivations would've come across more clearly if you had posted a paragraph-by-paragraph translation instead of the current abridged version. (I assume Dilemma Works is your Substack.) Lots of details that add color to the story got lost.

option 7/6/2025||

Doesn't feel like a healthy culture, IF true. Also, apparently current DeepSeek lab members aren't allowed to travel to conferences. This is all maybe good for execution but absolutely not for innovation

gausswho 7/6/2025||

"Saturday was a working day by default, though occasionally we had afternoon tea or even crayfish."

Unexpected poetry. Is there a reason why crayfish would be served in this context?

tecleandor 7/6/2025|

I understood it as "even though they made us work on Saturday, we sometimes had the luck of having some afternoon snack", and I guess crayfish might be popular there. Or maybe it's a mistranslation.

alwa 7/6/2025|||

Immensely popular, delicious, and very beautiful on a plate or in a bowl, both whole/boiled/stir-fried and as snack packs of pre-peeled tails! See, e.g.,

https://mychinesehomekitchen.com/2022/06/24/chinese-style-sp...

So yes, I read it the same way you do: “They made us work weekends, but at least they’d order us in some pizzas.”

(…and if you’re in the US, you can have them air-freighted live to you, and a crawfish boil is an easy and darn festive thing to do in the summer. If you’re put off by the crustacean staring back at you, and you have access to a kitchen that operates in a Louisianan style, you might be able to find a “Cajun Popcorn” of the tails seasoned, battered, and fried. Or maybe one of the enormous number of “seafood boil” restaurants that have opened in the US in recent years.)

(I feel like those establishments came on quickly, that I notice them mainly in spaces formerly occupied by American-Chinese restaurants, and that it’s felt like a nationwide phenomenon… I suspect there’s a story there for an enterprising young investigative nonfiction writer sort.)

tecleandor 7/6/2025||

Oh! That sounds tasty. I'm in EU, but I'm gonna take note of both. Thanks.

matt3210 7/6/2025|

The question is who really made the original models?
