Comment by october8140

16 days ago

All these AI text-to-voice models seem to ignore emotion. They always sound like robots.

I wonder if AI could create a "commentary" script that instructs the TTS how to read certain words or chapters. The commentary would be like an additional meta-track that helps the TTS deliver the best reading.

That should actually be possible to do already with existing tech. I haven't seen whether you can instruct Kokoro to read in a certain way. Does anyone know if this is possible?
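
To make the meta-track idea a bit more concrete, here is a rough sketch that assumes the TTS engine accepts SSML (the W3C speech markup with `prosody` and `break` elements). I don't know whether Kokoro does, so treat that as an assumption. The `Cue` class and `to_ssml` function are names I made up for illustration, and the bare `<speak>` wrapper leaves out the version and namespace attributes that a strict SSML parser may require.

```python
from dataclasses import dataclass
from xml.sax.saxutils import escape, quoteattr


@dataclass
class Cue:
    """One entry of the hypothetical commentary track."""
    text: str               # the words to be read
    rate: str = "medium"    # SSML prosody rate, e.g. "slow" or "fast"
    pitch: str = "medium"   # SSML prosody pitch, e.g. "low" or "high"
    pause_ms: int = 0       # silence to insert after this cue


def to_ssml(cues):
    """Render a list of cues as one SSML string."""
    parts = ["<speak>"]
    for cue in cues:
        # Wrap each chunk of text in the prosody hints from the meta-track.
        parts.append(
            f"<prosody rate={quoteattr(cue.rate)} pitch={quoteattr(cue.pitch)}>"
            f"{escape(cue.text)}</prosody>"
        )
        if cue.pause_ms > 0:
            parts.append(f'<break time="{cue.pause_ms}ms"/>')
    parts.append("</speak>")
    return "".join(parts)


# The cue list is what an LLM would have to generate from the book text.
cues = [
    Cue("It was a quiet night.", rate="slow", pitch="low", pause_ms=500),
    Cue("Then the door burst open!", rate="fast", pitch="high"),
]
ssml = to_ssml(cues)
print(ssml)
```

The point of keeping the cues as plain data is that the same track could be rendered to whatever control format a given engine understands, if it has one at all.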