Comment by woadwarrior01
6 months ago
> Serving models is currently expensive. I'd argue that some big cloud providers have conspired to make egress bandwidth expensive.
Cloudflare R2 doesn't charge for egress, and AFAIK, that's what ollama uses for hosting quantized model weights.