Comment by pzo
9 months ago
but this is still a great trick if you want to reduce latency or speed up inference, even with local models, e.g. in a realtime chatbot