Comment by barrell
9 hours ago
Yeah, Japanese TTS is a lot harder than it looks. I’m also building a language learning application, and I constantly ran into incorrect readings. ElevenLabs, ElevenLabs v3, OpenAI, play.ht, Azure, Google, Polly: I’ve tried them all. They are all really bad (more than a third of the expressions had an error in them somewhere).
It _is_ fixable though. It took me about a week, but I have yet to find a mistaken reading since. The problem also seems to be specific to Japanese: in most tonal languages the TTS output seems to have the correct tones (I’m not qualified to comment on how natural the tones sound, but I have yet to find a mismatch like the ones in Japanese).