Posted by Brajeshwar 10/26/2024

Open washing – why companies pretend to be open source(www.theregister.com)
140 points | 88 comments | page 2
meehai 10/26/2024|
I think Open Weights is a better name for AI models that don't share the reproducible training scripts and data.
goku12 10/27/2024|
By that logic, I can call any proprietary program 'open machine code' or 'open assembly'. If the artifact can't be built or modified easily, then it can't be considered open.
ahaucnx 10/26/2024||
I believe companies, or rather their decision makers, are often afraid of going fully open-source because they have invested a lot of money in the product and fear that some other company will use it, offer it cheaper and ultimately harm the originator.

So even though they might believe in open-source, they put protections in place that ultimately lock it down and thus make it closed source, while trying to keep the impression of being open.

In our journey at AirGradient towards becoming fully open-source hardware (all code and hardware licensed under CC-BY-SA), we had the same concerns but ultimately decided to go all-in and open up everything with an officially approved open-source license.

I believe there are a few important aspects and "protections" that are open-source compatible that help companies protect their investments.

Firstly, requiring attribution is compatible with open-source and can give companies a lot of visibility. Competitors probably don't want to attribute another company and are thus less likely to clone.

Secondly, using a share-alike license also makes the code unattractive for many other companies to use.

Lastly, I believe the code itself is often not the valuable part compared to the brand value, employees, reputation, business model, network and implicit knowledge that a company builds up.

It really worked for us to go that way with a true open-source license and I hope many others will do it too.

There are already some easy to understand licenses like CC in place and I do hope that they also create awareness around "open washing".

simonw 10/26/2024||
"Would it surprise you to know that according to the study, the big-name ones from Google, Meta, and Microsoft aren't? I didn't think so."

Microsoft has a decent LLM that I'd consider to be "open source": Phi-3.5, under the MIT license: https://huggingface.co/microsoft/Phi-3.5-vision-instruct

mistrial9 10/26/2024|
https://techcommunity.microsoft.com/t5/ai-azure-ai-services-...
rvnx 10/26/2024||
At the same time, Facebook is making some of the best efforts for open AI, so it's a bit hard to blame them. They are not perfect, but they still shared the most important artifact that came out of tens of millions of USD spent (or even more). They did not share the dataset, but it is really a major step forward.
rietta 10/26/2024||
I attended the referenced talk by Dan Lorenc in Alpharetta this week. It was very interesting. He hammered on how many licenses flunk the OSI test despite claiming to be open source.
blackeyeblitzar 10/26/2024||
It’s easy. They’re draining the phrase “open source” of meaning while gaining by marketing themselves that way. It’s fraudulent but also just exploitative.
gradientsrneat 10/26/2024||
Article commenter points out that Meta is a funder of the OSI. We'll see if that affects how the OSI defines "open" AI models.

I find it funny how OpenAI was only indirectly mentioned. Still, I'm glad that this columnist is taking a principled stance by arguing against one of the more borderline cases.

stonethrowaway 10/26/2024||
I commented on these moves and jukes a few months ago. In the spirit of not reposting, the original is here: https://news.ycombinator.com/item?id=41090142
pabs3 10/27/2024||
I like Debian's policy for libre AI:

https://salsa.debian.org/deeplearning-team/ml-policy/

tzs 10/26/2024|
> The Open Source Initiative (OSI) spells it out in the Open Source Definition, and Llama 3's license – with clauses on litigation and branding – flunks it on several grounds.

Anyone know specifically what he is talking about here?

The only things I'm seeing that I would consider to be clauses on litigation are one that terminates your license if you sue them claiming Llama 3 or its output violates your IP, and a choice of venue and choice of forum clause.

Several OSI approved licenses have "terminate on patent suit" clauses. Llama 3 is termination on IP suit rather than just on patent suit but I don't see anything in the OSD where that would make a difference.

There's stuff about trademarks, which I assume are the branding clauses he mentions. But I don't see anything obvious in the OSD that such clauses violate.

simonw 10/26/2024||
The Llama 3 license has all sorts of hokey extra clauses in it:

From https://www.llama.com/llama3/license/

> If, on the Meta Llama 3 version release date, the monthly active users of the products or services made available by or for Licensee, or Licensee’s affiliates, is greater than 700 million monthly active users in the preceding calendar month, you must request a license from Meta, which Meta may grant to you in its sole discretion, and you are not authorized to exercise any of the rights under this Agreement unless or until Meta otherwise expressly grants you such rights.

This seems harmless... until you ask what happens if you start a startup on top of Llama 3, do really well and later try to get acquired by one of the companies that had more than 700m active users on that date (Apple, Microsoft, Google etc)

> You will not use the Llama Materials or any output or results of the Llama Materials to improve any other large language model (excluding Meta Llama 3 or derivative works thereof).

That's a pretty huge restriction on ways you can use the models. The language "to improve any other large language model" is also incredibly vague.

> (B) prominently display “Built with Meta Llama 3” on a related website, user interface, blogpost, about page, or product documentation. If you use the Llama Materials to create, train, fine tune, or otherwise improve an AI model, which is distributed or made available, you shall also include “Llama 3” at the beginning of any such AI model name.

I love this one, it means that if you fine-tune a model for erotic furry fan fiction you HAVE to call it "Llama 3 Erotic Furry Fan Fiction Writer" or similar.

throwaway313373 10/27/2024||
> You will not use the Llama Materials or any output or results of the Llama Materials to improve any other large language model (excluding Meta Llama 3 or derivative works thereof).

How exactly would they know if I do?

Also, it doesn't make any sense that they trained this model using whatever stuff they could download from the Internet but we somehow could not do the same with their models.

pessimizer 10/26/2024||
But they said "several grounds" in the article. Isn't that enough? Why would you expect them to explain exactly where and how? A license is just a vibe anyway, it's the spirit that's important.
LtWorf 10/26/2024||
The article isn't a dissertation on that topic; you can check for yourself if you're interested.
tzs 10/26/2024|||
I did check for myself. And failed to find anything in the clauses on litigation or branding that obviously violated anything in OSI's Open Source Definition (OSD).

Hence, the question.

Simonw's response points out some unusual clauses, and at least one of them looks like it might go against one of the requirements in the OSD but it is not a litigation or branding clause and the article specifically called out the litigation and branding clauses.
