
Posted by MattHart88 17 hours ago

Show HN: Ghost Pepper – Local hold-to-talk speech-to-text for macOS (github.com)
I built this because I wanted to see how far I could get with a voice-to-text app that uses 100% local models, so no data leaves my computer. I've been using it a ton for coding and emails, and I'm experimenting with using it as a voice interface for my other agents too. 100% open source under the MIT license; would love feedback, PRs, and ideas on where to take it.
405 points | 180 comments
raybb 13 hours ago|
Would also like to know how it compares to https://github.com/openwhispr/openwhispr

I like that openwhispr lets me run on-device and also set a remote provider.

ghm2199 10 hours ago||
I've been using Handy for a month and it's awesome. I mainly use it with coding agents or when I don't want to type into text boxes. How is this different?

Part of the reason Handy is awesome is that it uses some of the same Rust infra for integrating with the model, which makes it possible to use the code as a library on Android or iOS. I have an Android app that runs a local model on the phone using this too.

bambushu 5 hours ago||
nice to see this running fully local. what model size are you shipping as default, and what's the cold-start time on Apple Silicon? I've been using Whisper locally for meeting transcription and the biggest friction point is always endpoint detection - knowing when you've stopped talking vs pausing to think. curious how you handle that with hold-to-talk.
mathis 16 hours ago||
If you don't feel like downloading a large model, you can also use `yap dictate`. Yap leverages the built-in models exposed through Speech.framework on macOS 26 (Tahoe).

Project repo: https://github.com/finnvoor/yap

boudra 10 hours ago||
Interesting, I'm surprised you went with Whisper, I found Parakeet (v2) to be a lot more accurate and faster, but maybe it's just my accent.

I implemented fully local hands free coding with Parakeet and Kokoro: https://github.com/getpaseo/paseo

hyperhello 16 hours ago||
Feature request or beg: let me play a speech video and transcribe it for me.
MattHart88 16 hours ago|
I like this idea and it should work -- whatever microphone you have on should be able to hear the speaker. LMK if not (e.g., if you're wearing headphones, the mic can't hear the speaker).
rcarmo 15 hours ago||
Not sure why I should use this instead of the baked-in OS dictation features (which I use almost daily -- just double-tap the Globe key and you're there). What's the advantage?
rane 7 hours ago||
- Way more accurate, especially with technical jargon. Try saying JSON as part of a sentence to macOS dictation and see what comes out.

- macOS dictation mutes other sounds while it's running. This is a deal-breaker for me.

qq66 15 hours ago||
I haven't used this one, but WisprFlow is vastly better than the built-in functionality on macOS. Apple is way behind even startups, even on fundamental AI functionality like transcribing speech.
ibero 15 hours ago|||
WisprFlow has a lot of good recommendations behind it but the fact they used Delve for SOC2 compliance gives me major pause.
janalsncm 14 hours ago|||
The fact that a company could slurp up all of your data and then use Delve for their SOC2 is a great reason to use local models.
jonwinstanley 15 hours ago|||
I use the baked in Apple transcription and haven't had any issues. But what I do is usually pretty simple.

What makes the others vastly better?

MattDamonSpace 14 hours ago||
I’ve rarely had macOS dictation produce a sentence I didn’t have to edit.

With Whisper models, I barely bother checking anymore.

pdyc 10 hours ago||
Interesting. I wanted something like this, but I'm on Linux, so I modified the whisper.cpp example to run on the CLI. It's quite basic: Ctrl+Alt+S to start/stop, and when you stop it copies the text to the clipboard. That's it. Now it's my daily driver: https://github.com/newbeelearn/whisper.cpp
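The hotkey -> whisper.cpp -> clipboard flow pdyc describes can be sketched in a few lines of Python. This is a rough illustration, not code from the linked fork: the `strip_timestamps` helper assumes whisper.cpp's default `[hh:mm:ss.mmm --> hh:mm:ss.mmm]` transcript prefix, and the clipboard call assumes `xclip` on X11.

```python
import re
import subprocess

# whisper.cpp's CLI prints lines like:
#   [00:00:00.000 --> 00:00:02.500]   hello world
# Strip the timestamp prefixes and join into one paste-able string.
TS = re.compile(r"^\[\d{2}:\d{2}:\d{2}\.\d{3} --> \d{2}:\d{2}:\d{2}\.\d{3}\]\s*")

def strip_timestamps(transcript: str) -> str:
    parts = []
    for line in transcript.splitlines():
        text = TS.sub("", line).strip()
        if text:
            parts.append(text)
    return " ".join(parts)

def copy_to_clipboard(text: str) -> None:
    # X11 assumed; swap in `wl-copy` on Wayland or `pbcopy` on macOS.
    subprocess.run(["xclip", "-selection", "clipboard"],
                   input=text.encode(), check=True)
```

The hotkey binding itself would live in the window manager or a tool like `sxhkd`, toggling the recorder and feeding the finished transcript through these two helpers.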
jannniii 8 hours ago||
Oh dear, why does it not use apfel for cleanup? No model download necessary…
janalsncm 14 hours ago|
I think the jab at the bottom of the README is referring to Wispr Flow?

https://wisprflow.ai/new-funding
