Comment by listic

2 days ago

Thanks for the advice! Could you please share how you enabled voice transcription for your setup and what it actually is?

I use https://github.com/braden-w/whispering with an OpenAI API key.

I use a keyboard shortcut to start and stop recording, and it puts the transcription into the clipboard so I can paste it into any app.

It's a huge productivity boost - OP is right that you shouldn't overthink being coherent - the models are very good at knowing what you mean (Opus 4.5 with Claude Code in my case).

  • I just installed this app and it is very nice. The UX is very clean and it transcribes whatever I say correctly. In fact, I'm transcribing this comment with this app right now.

    I am using Whisper Medium. The only problem I see is that it sometimes adds a "bye" or a "thank you" at the end of the message, which is kind of annoying.

  • I am all ready to believe that with LLMs it's not worth trying to be too coherent: I have successfully used LLMs to make sense of what incoherent-sounding people say (in text).
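For the stray "bye" or "thank you" that Whisper Medium sometimes tacks on, one workaround is to post-process the transcript before pasting it. Here is a minimal sketch in Python. The phrase list and the function name `strip_trailing_filler` are my own assumptions, not something the Whispering app provides, and it only strips a filler phrase when it follows sentence-ending punctuation, so a genuine "I wanted to say thank you" is left alone.

```python
import re

# Hypothetical list of filler phrases to drop; adjust to whatever your model tends to append.
TRAILING_FILLERS = ("thank you", "thanks for watching", "thanks", "bye bye", "bye")

# Match a filler phrase only when it sits at the very end of the text
# and directly follows sentence-ending punctuation.
_TRAILING_FILLER = re.compile(
    r"(?<=[.!?])\s*(?:" + "|".join(re.escape(p) for p in TRAILING_FILLERS) + r")[.!?\s]*$",
    re.IGNORECASE,
)


def strip_trailing_filler(text: str) -> str:
    """Remove filler phrases from the end of a transcript, repeating until none remain."""
    cleaned = text.rstrip()
    while (match := _TRAILING_FILLER.search(cleaned)) is not None:
        # Cut at the start of the match, which keeps the preceding punctuation.
        cleaned = cleaned[: match.start()]
    return cleaned


if __name__ == "__main__":
    print(strip_trailing_filler("Refactor the parser. Thank you. Bye."))
```

This won't catch a filler phrase that follows unpunctuated text, and a transcript that consists only of "Thank you." is left untouched, which seemed like the safer default.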

Aquavoice, a YC company, is really good. I got it after doing a bit of research on here. There's something for Mac that's supposed to be good too.

If you want local transcription, locally running models aren't quite good enough yet.

They use right-ctrl as their trigger. I've set mine to double tap and then I can talk with long pauses/thinking and it just keeps listening till I tap to finish.

I'm using Wispr Flow, but I've also tried Superwhisper. Both are fine. I have a convenient hotkey to start/end recording with one hand. Having it need just one hand is nice. I'm using this with the Claude Code VS Code extension in Cursor. If you go down this route, the Claude Code instance should be moved into a separate window outside your main editor, or else it'll flicker a lot.

  • Another option is MacWhisper if someone is on macOS and doesn't want to pay for a subscription (it's a one-time payment) - pretty much all of those apps these days use Parakeet from NVIDIA, which is the fastest and best open-source model that can run on edge devices.

    Also, I haven't tried it, but in the latest macOS 26 Apple updated their STT models, so the built-in voice dictation may be good enough.