Posted by meetpateltech 13 hours ago

Voxtral Transcribe 2(mistral.ai)
761 points | 180 comments | page 5
scotty79 8 hours ago|
Do you know anything better than Whisper large-v3 through WhisperX for low-quality Polish-language audio?

This combo has almost unbeatable accuracy and it rejects background noise really well. It can even reject people talking in the background.

The only better thing I've seen is the Ursa model from Speechmatics. Not open weights, unfortunately.

antirez 6 hours ago||
Disappointing how this lacks a clear reference implementation, other than what is mixed into not-yet-released vLLM (nightly version) stuff. I'm OK with Open Weights being a form of OSS in the case of models, because frankly I don't believe that, for large LLMs, it is feasible to release the training data, all the orchestration stuff, and so forth. But it can't be: here are the weights, we partnered with vLLM for inference. Come on. Open Weights must mean that you put me in a position to easily write an implementation for any hardware.

P.S. Even the demo uses a remote server via WebSocket.

dumpstate 11 hours ago||
I'm on voxtral-mini-latest and that's why I started seeing 500s today lol
boringg 11 hours ago||
Pseudo related -- am I the only one uncomfortable using my voice with AI, out of concern that once it is in the training model it is forever reproducible? As a non-public person it seems like a risk vector (albeit small).
ffsm8 10 hours ago|
It's a real issue, but why do you only see it in AI? It's true for any case where you're speaking into a microphone.

Depending on the permissions granted to apps on your mobile device, it can even be passively exfiltrated without you ever noticing - and that's ignoring the video clips people take and put online. Think of your grandma uploading a short clip from a Christmas gathering to Facebook, or similar.

There have already been successful scams, e.g. calls from "relatives" (AI) to family members, claiming to need money urgently and convincing them to send it...

varispeed 12 hours ago||
[flagged]
Empact 12 hours ago||
Many people speak Russian, including many who do not live in Russia, e.g. about 30% of Ukrainians.

Beyond that, I don't see how we stand to durably reduce military action by making languages mutually unintelligible.

https://simple.wikipedia.org/wiki/Russian_language#/media/Fi...

laffOr 12 hours ago|||
Don't they have a partnership with the French Armed Forces? I am sure they are interested in automating Russian Audio or Text (-> Russian Text) -> French text.
varispeed 10 hours ago||
Fair point.
gostsamo 12 hours ago||
They've chosen languages which would help them to cover the highest percentage of the human population.