Posted by denysvitali 9/2/2025

Apertus 70B: Truly Open - Swiss LLM by ETH, EPFL and CSCS (huggingface.co)
322 points | 61 comments
WhitneyLand 9/5/2025|
This is an impressive milestone.

It’s easy to become jaded with so many huge models being released, but the reality is they are still from a relatively small group of countries.

For example, India has no indigenous models this big despite having a world-class talent pool.

porridgeraisin 7 days ago|
> talent pool

Capital though ;)

[I am a grad student here in reinforcement learning]

Anyways, among all the VC/made-at-home driven snake oil, I'd say you should look at sarvam.ai; they are the most focused and no-nonsense group. They have a few good from-scratch models (I believe up to 7B or 14B), as well as a few Llama finetunes. Their API is pretty good.

The main thing folks here are attempting is to get LLMs good at local Indian languages (and I don't mean Hindi). I don't think people see value in creating an "indigenous llama" that doesn't have that property. For this, the main bottleneck is data (relatively speaking, there is zero data in those languages on the internet), so there's a team, AI4Bharat, whose main job is curating datasets good enough to get stuff like _translation_ and other NLP benchmarks working well. LLMs too, for which they work with sarvam frequently.

coalteddy 7 days ago||
Very cool. Love this. Was the training more heavily weighted towards Swiss languages, and how does the model perform on Swiss languages compared to others?

Are there any plans for further models after this one?

lllllm 7 days ago|
The pretraining (so 99% of training) is fully global, covering over 1000 languages without special weighting. The posttraining (see Section 4 of the paper) also included as many languages as we could get, and did upweight some languages. The posttraining can easily be customized to any other target languages.
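For a rough sense of what "customizing the posttraining to another target language" could look like with standard tooling, here is a minimal supervised fine-tuning sketch. The model id, dataset file, and hyperparameters are placeholders I picked for illustration, not the actual Apertus posttraining recipe:

    # Minimal SFT sketch for adapting a base model to a target language.
    # Assumptions: model id, dataset file, and hyperparameters are illustrative only.
    from datasets import load_dataset
    from transformers import (AutoModelForCausalLM, AutoTokenizer, Trainer,
                              TrainingArguments, DataCollatorForLanguageModeling)

    model_id = "swiss-ai/Apertus-8B-2509"  # assumed smaller sibling; check the HF org for real ids
    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(model_id)

    # Hypothetical instruction data in the target language (e.g. Romansh),
    # one JSON object with a "text" field per line.
    data = load_dataset("json", data_files="romansh_instructions.jsonl")["train"]

    def tokenize(batch):
        return tokenizer(batch["text"], truncation=True, max_length=1024)

    data = data.map(tokenize, batched=True, remove_columns=data.column_names)

    trainer = Trainer(
        model=model,
        args=TrainingArguments(output_dir="apertus-target-lang-sft",
                               per_device_train_batch_size=1,
                               num_train_epochs=1,
                               learning_rate=1e-5),
        train_dataset=data,
        data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
    )
    trainer.train()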
cwillu 7 days ago||
“The file reflects data protection deletion requests which have been addressed to SNAI as the developer of the Apertus LLM. It allows you to remove Personal Data contained in the model output. We strongly advise downloading and applying this output filter from SNAI every six months following the release of the model.”

I can't imagine that this actually complies with the law.
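For what it's worth, the mechanism described in that quote is essentially a post-hoc redaction pass over model output. A toy sketch of the idea follows; the filter file name and its one-string-per-line format are guesses on my part, not the actual SNAI artifact:

    # Toy sketch: strip strings from generated text that appear in a downloaded
    # deletion-request list. File name and format are assumptions, not the real filter.
    import re

    def load_filter_terms(path="apertus_output_filter.txt"):
        with open(path, encoding="utf-8") as f:
            return [line.strip() for line in f if line.strip()]

    def redact(text, terms, placeholder="[REMOVED]"):
        for term in terms:
            text = re.sub(re.escape(term), placeholder, text, flags=re.IGNORECASE)
        return text

    terms = load_filter_terms()
    print(redact("Contact Jane Example at jane@example.org", terms))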

tarruda 9/5/2025||
Is there any practical method to verify that the model was trained from the reported dataset?
lllllm 9/5/2025|
We released 81 intermediate checkpoints of the whole pretraining phase, plus the code and data to reproduce it, so a full audit is certainly possible. Still, it would depend on what you consider 'practical' here.
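Short of a full re-training, spot-checking the released intermediate checkpoints is straightforward with the Hugging Face `revision` argument. A sketch below; the revision names are assumptions, the actual checkpoint branches are listed on the model card:

    # Sketch: compare perplexity of two intermediate pretraining checkpoints
    # on a held-out sentence, to sanity-check against the reported training curve.
    # The revision names ("iter_0100000") are assumptions; see the model card.
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    model_id = "swiss-ai/Apertus-70B-2509"
    tokenizer = AutoTokenizer.from_pretrained(model_id)

    def perplexity(revision, text):
        model = AutoModelForCausalLM.from_pretrained(
            model_id, revision=revision, torch_dtype=torch.bfloat16)
        enc = tokenizer(text, return_tensors="pt")
        with torch.no_grad():
            loss = model(**enc, labels=enc["input_ids"]).loss
        return torch.exp(loss).item()

    sample = "Die Schweiz ist ein Binnenstaat in Mitteleuropa."
    for rev in ["iter_0100000", "main"]:
        print(rev, perplexity(rev, sample))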
xdennis 7 days ago||
I don't get it. How is it open if you can't even access it without signing a contract?
balder1991 6 days ago|
Apparently it’s a university thing.
mistrial9 7 days ago||
Regarding training data: is the main base model here trained only on FineWeb-2, or is there more to it as well?
titaniumrain 9/2/2025||
Seems DOA.
sschueller 9/3/2025|
How so?
titaniumrain 7 days ago||
900 downloads in 5 days. Not a very good sign, believe it or not.
habi 9/5/2025||
https://apertus.org/ has existed for 15 years; interesting choice of name.
Kye 7 days ago|
>> "Middle English, borrowed from Latin apertūra, from apertus, past participle of aperīre "to open" + -ūra -ure"

https://www.merriam-webster.com/dictionary/aperture

habi 7 days ago||
Sure, I just (honestly) wondered about trademarks.
Kye 7 days ago||
https://apertus.org/about

>> "The "o" stands for "open", "openness", "open source" and is placed where a "TM" symbol (indicating patents, trademarks, protection) would normally reside. Instead openness is the apertus° trademark."

It's also a completely different kind of thing, so trademark probably wouldn't come into it even if they had one.

cmdrk 9/3/2025|
Does their training corpus respect copyrights, or do you have to follow their opt-out procedure to keep them from consuming your data? Assuming it's the latter, it's open-er but still not quite there.
SparkyMcUnicorn 9/5/2025||
Your question is addressed in the opening abstract: https://github.com/swiss-ai/apertus-tech-report/raw/refs/hea...

> Unlike many prior models that release weights without reproducible data pipelines or regard for content-owner rights, Apertus models are pretrained exclusively on openly available data, retroactively respecting robots.txt exclusions and filtering for copyrighted, non-permissive, toxic, and personally identifiable content.

traspler 9/5/2025||
Afaik they respect robots.txt at crawl time, and later, when using the data, they re-check the robots.txt and exclude the data if it was updated to deny access. They have further data filtering, but for that you'd better check the technical report.
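For anyone curious what that re-check roughly amounts to, the Python standard library covers it. A minimal sketch, assuming a placeholder user-agent string rather than whatever identity the Apertus crawl pipeline actually uses:

    # Minimal sketch of re-checking robots.txt before reusing a crawled URL.
    # The user-agent string is a placeholder, not the actual crawler identity.
    from urllib import robotparser
    from urllib.parse import urlsplit

    def still_allowed(url, user_agent="ApertusBot"):
        parts = urlsplit(url)
        rp = robotparser.RobotFileParser(f"{parts.scheme}://{parts.netloc}/robots.txt")
        rp.read()  # fetches the current robots.txt from the site
        return rp.can_fetch(user_agent, url)

    print(still_allowed("https://example.com/some/page.html"))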