Posted by lastdong 9/3/2025
Making it "open" would be unwise for a commercial entity. =3
For example, many academic data sets are not public domain, and can't be used in a commercial context. A GPL claim on that data is often an argument of which thief showed up first.
Rule #24: A lawyer's Strategic Truth is to never lie, but also to avoid voluntarily disclosing information that may help opponents.
Thus, a business will never disclose they paid a fool to break laws for them... =3
Indeed, these adversarial behaviors do not follow the spirit of FOSS community standards. If a project started as FOSS, then FOSS it should remain. =3
https://github.com/microsoft/VibeVoice
I was trying to get this working on Strix Halo.
Still not at the astonishing level of Google Notebook's text-to-speech, which has been out for a while now. I still can't believe how good that one is.
So that's a useful next step: for multi-voice TTS models, make them sound like they're in the same room.
I would love to have a model that can make sense of things like stressing particular syllables or phonemes to make a point.