Comment by kevmo314
1 day ago
I think this is a pretty big limitation of the architecture (STT->LLM->TTS) they've chosen. The intonation around struggling to speak or difficulty with certain phrases is totally lost once the speech is transcribed to text.
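To make that concrete, here's a rough toy sketch of a cascaded pipeline. Everything in it is hypothetical (the `Utterance` fields and the `stt`/`llm`/`tts` stand-ins are made up for illustration, not any real product's API). The point is only that the LLM stage receives a plain string, so two utterances with the same words but very different delivery look identical to it.

```python
from dataclasses import dataclass


@dataclass
class Utterance:
    """Hypothetical audio input: the words plus some prosody features."""
    words: str
    pause_before_ms: int   # assumed feature: hesitation before speaking
    pitch_variance: float  # assumed feature: strain or unevenness in the voice


def stt(u: Utterance) -> str:
    # Transcription keeps only the words; the prosody fields are dropped here.
    return u.words


def llm(prompt: str) -> str:
    # Stand-in for the language model: it only ever sees a string.
    return f"reply to: {prompt}"


def tts(text: str) -> str:
    # Stand-in for speech synthesis.
    return f"<audio:{text}>"


def pipeline(u: Utterance) -> str:
    return tts(llm(stt(u)))


fluent = Utterance("I need help", pause_before_ms=0, pitch_variance=0.1)
strained = Utterance("I need help", pause_before_ms=1800, pitch_variance=0.9)

# Same words, different delivery, identical result downstream.
print(pipeline(fluent) == pipeline(strained))
```

A real system could pass extra annotations alongside the transcript, but in a pure text handoff like this there's nowhere for them to go.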