Comment by amelius

8 months ago

I tried some TTS models a while ago, but I noticed that none of them allowed markup statements in the text. For example, it would be nice to be able to write something like:

     Hey look! [enthusiastic] Should we tell the others? Maybe not ... [giggles]

etc.

In fact, I think this kind of markup is absolutely necessary if you want to use TTS to replace a voice actor.
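To make the idea concrete, here is a rough Python sketch of how a front end might split such a line into spoken text and delivery cues before handing it to a model. The bracket syntax, the `TAG_PATTERN` regex, and the `tokenize` helper are all hypothetical and only illustrate the example above; they are not taken from any particular TTS library.

```python
import re

# Hypothetical inline-tag syntax: a lowercase word or phrase in square brackets,
# e.g. "[enthusiastic]" or "[giggles]". This is an assumption for illustration.
TAG_PATTERN = re.compile(r"\[([a-z][a-z ]*)\]")


def tokenize(marked_up):
    """Split marked-up text into ("text", ...) and ("tag", ...) events, in order."""
    events = []
    pos = 0
    for match in TAG_PATTERN.finditer(marked_up):
        # Everything between the previous tag and this one is spoken text.
        spoken = marked_up[pos:match.start()].strip()
        if spoken:
            events.append(("text", spoken))
        events.append(("tag", match.group(1)))
        pos = match.end()
    # Any text after the last tag is also spoken.
    tail = marked_up[pos:].strip()
    if tail:
        events.append(("text", tail))
    return events


line = "Hey look! [enthusiastic] Should we tell the others? Maybe not ... [giggles]"
for kind, value in tokenize(line):
    print(kind, repr(value))
```

What a model should then do with each tag (apply it to the preceding text, the following text, or render it as a sound such as a giggle) is a separate design choice that this sketch leaves open.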

data-ottawa 8 months ago

ElevenLabs has some models that support that.

https://elevenlabs.io/blog/v3-audiotags