
Posted by petewarden 8 hours ago

Show HN: Moonshine Open-Weights STT models – higher accuracy than Whisper Large v3 (github.com)
I wanted to share our new speech-to-text models, and the library to use them effectively. We're a small startup (six people, sub-$100k monthly GPU budget) so I'm proud of the work the team has done to create streaming STT models with lower word-error rates than OpenAI's largest Whisper model. Admittedly Large v3 is a couple of years old, but we're near the top of the HF OpenASR leaderboard, even up against Nvidia's Parakeet family. Anyway, I'd love to get feedback on the models and software, and hear about what people might build with them.
177 points | 33 comments | page 2
asqueella 6 hours ago
For those wondering about the language support, currently English, Arabic, Japanese, Korean, Mandarin, Spanish, Ukrainian, Vietnamese are available (most in Base size = 58M params)
starkparker 4 hours ago
I implemented this to transcribe voice chat in a project, and the streaming accuracy in English was unusable, even with the medium streaming model.
oezi 3 hours ago
Do you also support timestamps for each detected word, or even down to characters?
pzo 7 hours ago
Haven't tested yet, but I'm wondering how it will behave when talking about lots of IT jargon and tech acronyms. For that reason I mostly had to run an LLM after STT, but that was slowing down Parakeet inference. Otherwise it sometimes had problems properly detecting terms such as CoreML, int8, fp16, half float, ARKit, AVFoundation, ONNX, etc.
saltwounds 5 hours ago
Streaming transcription is crazy fast on an M1. Would be great to use this as a local option versus Wispr Flow.
g-mork 7 hours ago
How does this compare to Parakeet, which runs wonderfully on CPU?
raybb 2 hours ago
fyi the typepad link in your bio is broken
sroussey 7 hours ago
Are ONNX models for the browser possible?
alexnewman 6 hours ago
If only it did Doric
lostmsu 7 hours ago
How does it compare to Microsoft VibeVoice ASR (https://news.ycombinator.com/item?id=46732776)?