Comment by whimsicalism
4 days ago
I imagine you have to start decoding many speculative completions in parallel to have true low latency.
4 days ago
I imagine you have to start decoding many speculative completions in parallel to have true low latency.
No comments yet
Contribute on Hacker News ↗