Posted by T-A 1 day ago

Apertus – Open Foundation Model for Sovereign AI (apertvs.ai)
511 points | 170 comments | page 3
pizlonator 18 hours ago|
> compliant at scale

The jokes write themselves.

david_shi 20 hours ago||
These models don't seem very competitive. Who's their target audience?
poplarsol 20 hours ago|
Europeans who fetishize "compliance".
markhahn 16 hours ago|||
Residents of the universe who recognize the US as a supply-chain risk.

No, actually, from the docs it sounds mainly motivated by the country's unique linguistic requirements.

dangoodmanUT 21 hours ago||
How are they going to be competitive with top models at 70B size?
kennywinker 17 hours ago|
Qwen et al. show that size isn’t the only useful metric for an LLM.
nisten 21 hours ago||
As an open-source AI researcher with a lot of models and datasets on Hugging Face, I am very appreciative of these types of projects, but we are ignoring the elephant in the room here (or lack thereof):

The Swiss have no GPUs.

T-A 14 hours ago||
the Apertus model was trained on the Alps supercomputer, operational at CSCS since September 2024, a data center of over 10'000 top-of-the-line NVIDIA Grace-Hopper chips

https://log.alets.ch/110/

kennywinker 20 hours ago|||
How is this a real problem? Genuine question, because I don’t really understand the urgency of everyone buying up RAM and GPUs as prices for those skyrocket.

I can run the 8B version of this swiss-ai model on a ten-year-old GPU. For the larger one, $2000 consumer hardware can run it fine. Beyond that, there are plenty of places where time on a GPU can be rented, and if the model is good, there will be hardware to run it.
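
As a rough back-of-the-envelope check on the hardware claim above, here is a minimal sketch that estimates how much memory the weights alone would need at a few precisions. It assumes weights-only memory (no KV cache, activations, or runtime overhead) and that quantized builds at 8 or 4 bits per parameter are available; the function name `weight_memory_gb` is hypothetical and exists only for this illustration.

```python
def weight_memory_gb(params_billion: float, bits_per_param: int) -> float:
    """Approximate memory for the model weights only, in GB (10^9 bytes).

    Assumption: every parameter is stored at the same precision, and
    nothing else (KV cache, activations, framework overhead) is counted.
    """
    total_bytes = params_billion * 1e9 * bits_per_param / 8
    return total_bytes / 1e9


if __name__ == "__main__":
    # The two model sizes mentioned in this thread: 8B and 70B.
    for size in (8, 70):
        for bits in (16, 8, 4):
            gb = weight_memory_gb(size, bits)
            print(f"{size}B @ {bits}-bit: ~{gb:.0f} GB")
```

Under these assumptions the 8B weights come to roughly 4 GB at 4 bits per parameter and the 70B weights to roughly 35 GB, so real-world requirements will be somewhat higher once context and overhead are added.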

pu_pe 13 hours ago||
You can run it, but you can't train it. While this type of toy model could actually be trained on Swiss equipment, a state-of-the-art LLM probably could not.

My charitable reading of GP's point is that the bottleneck for true compute sovereignty is the chips, not the models.

khalic 8 hours ago|||
Do some research before posting that kind of stuff
markhahn 16 hours ago||
Why do you say the Swiss have no GPUs?
markab21 21 hours ago||
I'm mildly surprised that more people aren't using Nemo models for this reason. We've moved most of our processing to a combination of Nemo Ultra and Super, with some support for multi-model-specific tasks on Omni. The setup is working REALLY well for us, and I'm comfortable with the more measured pace of improvements. We work with many long-context problems, and the ecosystem is great.

There were a number of use cases where we needed to use Gemini (audio modality), and Ultra has been a VERY cost-effective alternative once we got through the nuances.

firstrowraver 13 hours ago||
apertvs.ai? seriously?
andrewshadura 16 hours ago||
Not to be confused with Apertium and Apertis.