Comment by whimsicalism
5 days ago
I imagine you have to start decoding many speculative completions in parallel to have true low latency.
5 days ago
I imagine you have to start decoding many speculative completions in parallel to have true low latency.
No comments yet
Contribute on Hacker News ↗