Comment by whimsicalism
2 months ago
I imagine you have to start decoding many speculative completions in parallel to have true low latency.