
Posted by lastdong 9/3/2025

VibeVoice: A Frontier Open-Source Text-to-Speech Model (microsoft.github.io)
448 points | 170 comments | page 4
ml_basics 9/3/2025|
What's the relationship between this work and the voice models recently announced by Microsoft AI? https://microsoft.ai/news/two-new-in-house-models/
ehutch79 9/3/2025||
The examples are kind of off-putting. We're definitely in uncanny valley territory here.
nextworddev 9/3/2025||
Still haven’t found anything better than Kokoro TTS. Does anyone know of something better?
egorfine 9/3/2025||
[deleted - I'm an idiot]
x187463 9/3/2025|
Whisper is speech-to-text. VibeVoice is text-to-speech.
mpeg 9/3/2025|||
There is a text-to-speech version of Whisper, but IMHO the quality is much worse than the demos of this model.
x187463 9/3/2025||
Are you referring to this?

https://github.com/WhisperSpeech/WhisperSpeech

Or is there some OpenAI official Whisper TTS?

mpeg 9/3/2025||
Yep, nothing official that I know of, but that one is fairly popular so maybe they were referring to it (although AFAIK it's not frontier?)
egorfine 9/3/2025|||
I stand corrected
weeb 9/3/2025||
Does anyone know of recent TTS options that let you specify IPA rather than written words? Azure lets you do this, but something local (and better than existing OS voices) would be great for my project.
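
For context, cloud services that accept IPA usually take it through the SSML `<phoneme>` element with `alphabet="ipa"`, as defined in the W3C SSML spec. Below is a minimal sketch of building that fragment in Python. The helper name `ipa_ssml` is hypothetical, and a real service will likely also want a namespace declaration and a voice wrapper around it, so treat this as an illustration of the markup rather than a ready-to-send request.

```python
import xml.etree.ElementTree as ET


def ipa_ssml(word: str, ipa: str, lang: str = "en-US") -> str:
    """Wrap one word in an SSML <phoneme> element carrying its IPA transcription.

    Hypothetical helper for illustration; it only builds the markup and does
    not call any TTS service.
    """
    # Root <speak> element; xml:lang is the standard SSML language attribute.
    speak = ET.Element("speak", {"version": "1.0", "xml:lang": lang})
    # <phoneme alphabet="ipa" ph="..."> tells the engine how to pronounce the text.
    phoneme = ET.SubElement(speak, "phoneme", {"alphabet": "ipa", "ph": ipa})
    phoneme.text = word
    # encoding="unicode" returns a str and keeps the IPA characters readable.
    return ET.tostring(speak, encoding="unicode")


if __name__ == "__main__":
    print(ipa_ssml("tomato", "təˈmeɪtoʊ"))
```

Building the string with an XML library rather than by hand means any quotes or ampersands in the input get escaped correctly.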
andybug 9/3/2025|
I'm using Kokoro via https://github.com/remsky/Kokoro-FastAPI. It has a `generate_audio_from_phonemes()` endpoint that I'm sure maps to the Kokoro library if you want to use it directly.

My usage is for Chinese, but the phonemes it generated looked very much like IPA.

swiftcoder 9/3/2025||
Ah, yes, the Furious 7 soundtrack. Definitely something everyone recalls.
closewith 9/3/2025|
The most popular song of the year from one of the most popular movie franchises that had been in the global news due to the death of its star. Probably the most memorable song from a soundtrack of the century so far.
agos 9/3/2025||
I'm Just Ken (Barbie), Skyfall, Let It Go (Frozen), Remember Me (Coco), Happy (Despicable Me 2), and Shallow (A Star Is Born) are all arguably wayyyyy more memorable, and these are just off the top of my head. We've had quite a few memorable songs in soundtracks this millennium.

edit: I had forgotten about Jai Ho (Slumdog Millionaire) and Lose Yourself (8 Mile)

closewith 9/3/2025|||
It's obviously subjective, but in terms of numbers the only contender in that list is Let It Go, which had about 1/3rd the reach.

Nothing on that list - movies or songs - had the cultural impact of Furious 7 or See You Again.

ascorbic 9/3/2025|||
And most recently "Golden"
throwaw12 9/3/2025||
Will there be support for SSML, to give more control over the conversation?
tehlike 9/3/2025||
The comments in the HTML code are in Chinese, which is very interesting.
Havoc 9/3/2025||
MIT license - very nice!
ComputerGuru 9/3/2025||
The application of known FOSS licenses to what is effectively a binary-only release is misleading and borderline meaningless.
Havoc 9/3/2025||
It is an unfortunate recycling of an existing regime that no doubt offends Stallman to his very core, but I wouldn't call it meaningless.

If you're in a company and need a model, which one do you think you're getting past compliance & legal - the one that says MIT or the one that says "non-commercial use only"?

em-bee 9/3/2025|||
What does that mean in this context? It seems to depend on an LLM, so can I run this completely offline? If I have to sign up and pay for an LLM to make it work, then it's not really more useful than any other non-free system.
watsonmusic 9/3/2025||
Microsoft is cool
lagniappe 9/3/2025|
Bots should never sing.
cindyllm 9/3/2025|
[dead]