Comment by ody4242
10 days ago
I'm 100% sure that all providers are playing with the quantization, KV cache, and other parameters of their models to be able to serve the demand. One of the biggest advantages of running a local model is that you get predictable behavior.