Comment by mariano54

8 months ago

Thank you so much for that!

We focused on testing and tweaking the most popular languages, and we have not tested some of the niche ones. We have removed languages that users have told us have major issues, but there are still some left.

The voices are due to the quality of the TTS services that we use: OpenAI, 11labs, and MiniMax. Some services don't have many good voices, or even one. We will add more over time.

Sesame also passes the user's voice into the TTS model so that it can vibe well with the user's tone and mood, whereas we are just using raw TTS. Their latency is also very low, but this is not quite suitable for language learning.

In the future we hope to move to full voice-to-voice models once those become mature and intelligent enough.
