← Back to context

Comment by vlovich123

10 hours ago

You ultimately still need a jitter buffer large enough to absorb retransmisiones. Otherwise you’ve got stuttering audio. And dynamically adjusting this jitter buffer is hard

I'm not an expert. Can't we abuse that LLMs don't need to receive audio as a continuous stream without interruptions? Couldn't we just send data and pipe it into the LLM with deduplication (if resending happens)?

  x...y...y[dedup]...z

  • You’re absolutely correct. A jitter buffer is necessary for a human listener, but a LLM isn’t aware of a time lapse, just like it isn’t aware of the time since your last message in the conversion (unless the chat harness explicitly informs it).

> And dynamically adjusting this jitter buffer is hard

Unappreciated part of this entire conversation.