Comment by storus
8 hours ago
They significantly lowered latency compared to EPYC/Xeon, which is critical for streaming agents (e.g. text/audio/video agents).
Which latency, specifically? And how does it compare to LLM inference speed?