Comment by PhilippGille
10 hours ago
OP asked:
> Is anyone doing true end-to-end speech models locally (streaming audio out), or is the SOTA still “streaming ASR + LLM + streaming TTS” glued together?
Your setup is the latter, not the former.