Comment by sanchitmonga22

3 months ago

The default TTS voice (Piper) is a lightweight model optimized for speed over quality. It's fast but yeah, it doesn't sound great.

If you install Kokoro TTS (rcli models > TTS section), the voice quality is dramatically better. It's a neural TTS model with 28 different voices. MetalRT synthesizes Kokoro in 178 ms for short responses, so you don't pay a speed penalty for the upgrade.

We should probably make Kokoro the default, or at least make the upgrade path more obvious in the first-run experience. Fair feedback.