The 500ms covers the transcription/LLM/TTS steps (i.e. the response time from data arriving on the server to sending the reply back); the rest seems to be various non-AI "overheads" such as encoding, network traffic, etc.
The latencies in the table are based on heuristics or averages that we've observed. In reality, though, depending on the conversation, some of the larger latency components can be much lower.
That's called marketing
My tests had one outlier at 1400ms and ten or so between 400 and 500ms. I think the marketing numbers were fair.
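For anyone wanting to reproduce this kind of test, here's a minimal sketch in Python. The `send_utterance` callable is a hypothetical stand-in for whatever fires one audio request and blocks until the first response byte comes back (which is the latency a listener actually perceives); the timing and summary logic is just the standard library.

```python
import statistics
import time

def measure_latency(send_utterance, trials=12):
    """Time repeated round trips and summarize the distribution.

    `send_utterance` is a hypothetical stand-in: any callable that
    sends one audio request and returns once the first response
    byte arrives.
    """
    samples_ms = []
    for _ in range(trials):
        start = time.perf_counter()
        send_utterance()
        samples_ms.append((time.perf_counter() - start) * 1000)

    samples_ms.sort()
    return {
        "min_ms": samples_ms[0],
        "median_ms": statistics.median(samples_ms),
        "max_ms": samples_ms[-1],  # outliers like the 1400ms run show up here
    }
```

Reporting the max alongside the median matters here: a single 1400ms outlier barely moves an average over a dozen runs, which is exactly why the 400-500ms cluster and the tail are worth looking at separately.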