Posted by meetpateltech 11 hours ago

Voxtral Transcribe 2 (mistral.ai)
720 points | 174 comments | page 4
blobinabottle 7 hours ago||
Impressive results, tested on crappy audio files (in French and English)...
numbers 7 hours ago||
Does anyone know of any desktop tools I can use this transcription model with? E.g., something like Wispr Flow/WillowVoice but with custom model selection.
tietjens 7 hours ago|
There is Handy, an open source project meant to be a desktop tool, but I haven’t installed it yet to see how you pick your model.

Handy – Free open source speech-to-text app https://github.com/cjpais/Handy

tallesborges92 7 hours ago||
I added it to my bot agent, let’s see how it performs.
atentaten 6 hours ago||
Nice. Can this be run on a mobile device?
derac 8 hours ago||
Any chance Voxtral Mini Transcribe 2 will ever be an open model?
asah 3 hours ago||
Smells Like Teen Spirit survives another challenge!

Voxtral Transcribe 2:

Light up our guns, bring your friends, it's fun to lose and to pretend. She's all the more selfish, sure to know how the dirty world. I wasn't what I'd be best before this gift I think best A little girl is always been Always will until again Well, the lights out, it's a stage And we are now entertainers. I'm just stupid and contagious. And we are now entertainers. I'm a lot of, I'm a final. I'm a skater, I'm a freak. Yeah! Hey! Yeah. And I forget just why I taste it Yeah, I guess it makes me smile I found it hard, it's hard to find the well Whatever, never mind Well, the lights out, it's a stage. You and I are now entertainers. I'm just stupid and contagious. You and I are now entertainers. I'm a lot of, I'm a minor. I'm a killer. I'm a beater. I'm a nerd. I'm a nerd. I'm a nerd. I'm a nerd. I'm a nerd. I'm a nerd. I'm a nerd. I'm a nerd. I'm a nerd. And I forget just why I taste it Yeah, I guess it makes me smile I found it hard, it's hard to find the well Whatever, never mind I know, I know, I know, I know, I know Well, the lights out, it's a stage. You and I are now entertainers. I'm just stupid and contagious. You and I are now entertainers. I'm a lot of, I'm a minor. I'm a killer. I'm a beater. I'm a nerd. I'm a nerd. I'm a nerd. I'm a nerd. I'm a nerd. I'm a nerd. I'm a nerd. I'm a nerd. I'm a nerd.

Google/Musixmatch:

Load up on guns, bring your friends It's fun to lose and to pretend She's over-bored, and self-assured Oh no, I know a dirty word Hello, hello, hello, how low? Hello, hello, hello, how low? Hello, hello, hello, how low? Hello, hello, hello With the lights out, it's less dangerous Here we are now, entertain us I feel stupid and contagious Here we are now, entertain us A mulatto, an albino A mosquito, my libido, yeah Hey, yey I'm worse at what I do best And for this gift, I feel blessed Our little group has always been And always will until the end Hello, hello, hello, how low? Hello, hello, hello, how low? Hello, hello, hello, how low? Hello, hello, hello With the lights out, it's less dangerous Here we are now, entertain us I feel stupid and contagious Here we are now, entertain us A mulatto, an albino A mosquito, my libido, yeah Hey, yey And I forget just why I taste Oh yeah, I guess it makes me smile I found it hard, it's hard to find Oh well, whatever, never mind Hello, hello, hello, how low? Hello, hello, hello, how low? Hello, hello, hello, how low? Hello, hello, hello With the lights out, it's less dangerous Here we are now, entertain us I feel stupid and contagious Here we are now, entertain us A mulatto, an albino A mosquito, my libido A denial, a denial A denial, a denial A denial, a denial A denial, a denial A denial

asah 3 hours ago|
(when it was released, adults/press/etc. found SLTS famously incomprehensible and then they realized that the kids didn't understand the lyrics either, and Weird Al nailed it with his classic, Smells Like Nirvana: https://www.google.com/search?q=Smells+Like+Nirvana )
ewuhic 9 hours ago||
Can it translate in real time?
unstatusthequo 4 hours ago|
Also curious about this. I just need real-time German to English. What can do this?
antirez 4 hours ago||
It's disappointing that this lacks a clear reference implementation, other than what is mixed into barely released vLLM (nightly version) code. I'm OK with open weights being a form of OSS in the case of models, because frankly I don't believe that, for large LLMs, it is feasible to release the training data, all the orchestration stuff, and so forth. But it can't be: "here are the weights, we partnered with vLLM for inference." Come on. Open weights must mean that you put me in a position to easily write an implementation for any hardware.

p.s. even the demo uses a remote server via websocket.

scotty79 6 hours ago|
Do you know of anything better for low-quality Polish-language audio than Whisper large-v3 through WhisperX?

This combo has almost unbeatable accuracy and it rejects noises in the background really well. It can even reject people talking in the background.

The only better thing I've seen is the Ursa model from Speechmatics. Not open weights, unfortunately.
