Huawei cloned Qwen and DeepSeek models, claimed as own

Posted by dworks 23 hours ago

Huawei cloned Qwen and DeepSeek models, claimed as own (dilemmaworks.substack.com)

114 points | 56 comments | page 2

option 20 hours ago|

Doesn't feel like a healthy culture, IF true. Also, apparently current DeepSeek lab members aren't allowed to travel to conferences. This may all be good for execution, but absolutely not for innovation.

tengbretson 22 hours ago||

In the LLM intellectual property paradigm, I think this registers as a solid "Who cares?" level offence.

brookst 22 hours ago||

The point isn’t some moral outrage over IP, the point is a company may be falsely claiming to have expertise it does not have, which is meaningful to people who care about the market in general.

tonyedgecombe 22 hours ago|||

Nobody who pays attention to Huawei will be surprised. They have a track record of this sort of behaviour going right back to their early days.

npteljes 22 hours ago||

While true, these sorts of reports are the track records which we can base our assessments on.

some_random 22 hours ago|||

Claiming to care deeply about IP theft in the more nebulous case of model training datasets then dismissing the extremely concrete case of outright theft seems pretty indefensible to me.

Arainach 22 hours ago|||

Everyone has a finite amount of empathy, and I'm not going to waste any of mine on IP thieves complaining that someone stole their stolen IP from them.

mensetmanusman 20 hours ago||

It’s theft in the way taking a picture of nature that you had nothing to do with is theft.

Arainach 20 hours ago||

This line of argument was worn out and tired when 14 year olds on Napster were parroting it in 1999.

pton_xd 22 hours ago||||

> dismissing the extremely concrete case of outright theft seems pretty indefensible to me.

Outright theft is a meaningless term here. The new rules are different.

The AI space is built on "traditionally" bad faith actions. Misappropriation of IP by using pirated content and ignoring source code licenses. Borderline malicious website scraping. Recitation of data without attribution. Copying model code / artifacts / weights is just the next most convenient course of action. And really, who cares? The ethical operating standards of the industry have been established.

perching_aix 22 hours ago|||

Par for the course for emotional thinking, I'm not even surprised anymore.

didibus 22 hours ago|||

Ya, the models have already stolen everyone's copyrighted intellectual property. Not sure I have a lot of sympathy. In fact, the more the merrier: if we're going to brush off that they're all trained on copyrighted material, we might as well make sure they end up a really cheap, competitive, low-margin, accessible commodity.

lambdasquirrel 22 hours ago||

Eh... you should read the article. It sounds like a pretty big deal.

didibus 18 hours ago||

I did read the article. Apart from the fact that it sounds like a terrible place to work, I'm not sure I see what the big deal is.

No one knows how any of the models got made: their training data is kept secret, we don't know what it contains, and so on. I'm also pretty sure a few of the main model makers poached each other's employees, who then just reimplemented the same training models with some twists.

Most LLMs are also based on initial research papers where most of the discovery and innovation took place.

And in the end, it's all trained on data that very few people agreed or intended to have used for this purpose, and for which none of them will see a dime.

So why not wrap and rewrap models and resell them, and let it all compete for who offers the cheapest plan or per-token cost?

esskay 22 hours ago|||

It is very hard to have any sympathy: they stole stolen material from people known not to care that they are stealing.

mathverse 22 hours ago||

[flagged]

oblio 22 hours ago||

Didn't know Sam Altman was Chinese :-)

typon 22 hours ago||

LLMs are all built on stolen data. There is no such thing as intellectual property in LLMs.

mattnewton 22 hours ago||

That’s not the point IMO; the point was this was being used to display capabilities to train models with Huawei software and hardware.

mensetmanusman 20 hours ago||

/robots that read books in the library are stealing/

hereme888 22 hours ago||

[flagged]

jambutters 21 hours ago|

I don't think anyone cares about that. OpenAI ripped off the internet and books. DeepSeek distilled some of OpenAI and pushed the field forward.

mystraline 21 hours ago|

[flagged]

knowitnone 21 hours ago|

"Google cloned Linux kernel, claimed as own." Link?

owebmaster 20 hours ago||

Android