Comment by vlovich123

10 hours ago

You ultimately still need a jitter buffer large enough to absorb retransmissions. Otherwise you’ve got stuttering audio. And dynamically adjusting this jitter buffer is hard.

4 comments


davidkunz 9 hours ago

I'm not an expert. Can't we exploit the fact that LLMs don't need to receive audio as a continuous, uninterrupted stream? Couldn't we just send the data and pipe it into the LLM, deduplicating whenever something gets resent?

  x...y...y[dedup]...z
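A minimal sketch of the dedup idea above, assuming each audio chunk arrives tagged with a sequence number that starts at 0 (the `(seq, payload)` shape and the name `dedup_chunks` are made up for illustration, not taken from any real transport). It drops resent chunks and releases the rest in order, with no timing constraint on when they show up:

```python
from typing import Dict, Iterable, Iterator, Tuple


def dedup_chunks(packets: Iterable[Tuple[int, bytes]]) -> Iterator[bytes]:
    """Yield each audio chunk exactly once, in sequence order.

    Assumes sequence numbers start at 0 and increase by 1 per chunk
    (an assumption for this sketch, not a property of any given protocol).
    """
    next_seq = 0                    # next sequence number to release
    pending: Dict[int, bytes] = {}  # chunks that arrived ahead of a gap
    for seq, payload in packets:
        if seq < next_seq or seq in pending:
            continue                # resend of a chunk we already have
        pending[seq] = payload
        # Release every chunk that is now contiguous with what was emitted.
        while next_seq in pending:
            yield pending.pop(next_seq)
            next_seq += 1


# The x...y...y[dedup]...z case from the diagram:
stream = [(0, b"x"), (1, b"y"), (1, b"y"), (2, b"z")]
chunks = list(dedup_chunks(stream))
```

Nothing here waits on a clock: a late chunk only delays the output, it never causes a dropout, which is the difference from a playout buffer feeding a human ear.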

vlovich123 5 minutes ago

Audio -> ASR - no jitter buffer
TTS -> human - jitter buffer

shwaj 8 hours ago

You’re absolutely correct. A jitter buffer is necessary for a human listener, but an LLM isn’t aware of a time lapse, just like it isn’t aware of the time since your last message in the conversation (unless the chat harness explicitly informs it).

fidotron 2 hours ago

> And dynamically adjusting this jitter buffer is hard

Unappreciated part of this entire conversation.
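To give a feel for why, here is a rough sketch of only the easy half: estimating how big the buffer should be. It uses the smoothed interarrival-jitter formula from RFC 3550 (gain 1/16). The class name, the 4x safety factor and the 20 ms floor are assumptions picked for illustration, not values from any particular stack:

```python
class AdaptivePlayoutDelay:
    """Tracks smoothed jitter and suggests a playout delay in milliseconds."""

    def __init__(self, gain: float = 1 / 16, safety: float = 4.0,
                 floor_ms: float = 20.0) -> None:
        self.gain = gain            # smoothing gain, 1/16 as in RFC 3550
        self.safety = safety        # assumed multiplier on the jitter estimate
        self.floor_ms = floor_ms    # assumed minimum buffer depth
        self.jitter_ms = 0.0
        self._prev_transit = None

    def observe(self, sent_ms: float, received_ms: float) -> float:
        """Feed one packet's timestamps; return the suggested delay."""
        transit = received_ms - sent_ms
        if self._prev_transit is not None:
            # Only the change in transit time matters, so a constant
            # clock offset between sender and receiver cancels out.
            d = abs(transit - self._prev_transit)
            self.jitter_ms += (d - self.jitter_ms) * self.gain
        self._prev_transit = transit
        return self.target_delay_ms()

    def target_delay_ms(self) -> float:
        return max(self.floor_ms, self.safety * self.jitter_ms)


est = AdaptivePlayoutDelay()
steady = [est.observe(0, 50), est.observe(20, 70)]  # constant 50 ms transit
after_spike = est.observe(40, 250)                  # one packet 160 ms late
```

The estimate is the simple part. The sketch says nothing about when to actually grow or shrink the buffer without an audible glitch, or about a retransmitted packet arriving at least a round trip late, and that is presumably where the difficulty lives.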