Comment by ClawsOnPaws

1 month ago

You are correct. At least in my case, more synthetic voices like Eloquence are easier to understand at high speeds especially because of their 'formulaic' nature. You don't listen to each individual phoneme or letter, you listen more for groups of syllables, tone, etc. The more unpredictable the text to speech, the harder this is. Also, performance is another big point. If you have large bits of silence at the beginning of the audio, or slow attacks, then the responsiveness will suffer, whether that's because of the actual audio itself, or the generation time.

Some of this is surely ssubjective, but I'm pretty sure I'm not the only screen reader user with these opinions.

0 comments

ClawsOnPaws

No comments yet

Contribute on Hacker News ↗