Apertus – Open Foundation Model for Sovereign AI

Posted by T-A 1 day ago

Apertus – Open Foundation Model for Sovereign AI (apertvs.ai)

511 points | 170 comments | page 3

pizlonator 18 hours ago

> compliant at scale

The jokes write themselves.

david_shi 20 hours ago

These models don't seem very competitive. Who's their target audience?

poplarsol 20 hours ago

Europeans who fetishize "compliance".

markhahn 16 hours ago

Residents of the universe who recognize the US as a supply-chain risk.

No, actually, from the docs it sounds mainly motivated by the country's unique linguistic requirements.

dangoodmanUT 21 hours ago

How are they going to be competitive with top models at 70B size?

kennywinker 17 hours ago

Qwen et al. show that size isn’t actually the only useful metric for an LLM.

nisten 21 hours ago

As an open-source AI researcher with a lot of models and datasets on Hugging Face, I am very appreciative of these types of projects, but we are ignoring the elephant in the room here (or the lack of one):

The Swiss have no GPUs.

T-A 14 hours ago

The Apertus model was trained on the Alps supercomputer, operational at CSCS since September 2024, a data center of over 10'000 top-of-the-line NVIDIA Grace-Hopper chips.

https://log.alets.ch/110/

kennywinker 20 hours ago

How is this a real problem? Genuine question, because I don’t really understand the urgency of everyone buying up RAM and GPUs as prices for those skyrocket.

I can run the 8B version of this swiss-ai model on a ten-year-old GPU. For the larger one, $2000 consumer hardware can run it fine. Beyond that, there are plenty of places where time on a GPU can be rented, and if the model is good, there will be hardware to run it.

pu_pe 13 hours ago

You can run it, but you can't train it. While this type of toy model could actually be trained on Swiss equipment, a state-of-the-art LLM probably could not.

My charitable reading of GP's point is that the bottleneck for true compute sovereignty is the chips, not the models.

khalic 8 hours ago

Do some research before posting that kind of stuff.

markhahn 16 hours ago

Why do you say the Swiss have no GPUs?

markab21 21 hours ago

I'm mildly surprised that more people aren't using Nemo models for this reason. We've moved most of our processing to a combination of Nemo Ultra and Super, with some support for multi-model-specific tasks on Omni. The setup is working REALLY well for us, and I'm comfortable with the more measured pace of improvements. We work with many long-context problems, and the ecosystem is great.

There were a number of use cases where we needed to use Gemini (audio modality), and Ultra has been a VERY cost-effective alternative once we got through the nuances.

firstrowraver 13 hours ago

apertvs.ai? seriously?

andrewshadura 16 hours ago

Not to be confused with Apertium and Apertis.
