Comment by mariano54

1 day ago

Thanks! No, it cannot do this yet, since it uses a voice-to-text-to-voice pipeline. I think models are heading in that direction, and voice-to-voice models are getting better.

For pitch accent, shadowing is a great way to improve. You can pause and repeat the tutor's messages, for example, or read the word aloud when doing flashcard reviews (copying the flashcard audio).