Comment by yencabulator
6 months ago
And now you need a server per model? Ollama loads models on demand and unloads them after an idle timeout, all accessible over the same HTTP API.
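For anyone curious, here's a minimal sketch of what that looks like against Ollama's default local endpoint. The two model names are just examples and must already be pulled; the `keep_alive` field is the knob that controls the idle unload the comment mentions.

```python
# Minimal sketch (Python stdlib only): two different models served
# through the same Ollama HTTP endpoint on the default port (11434).
# Assumes a local Ollama instance and that "llama3" and "mistral"
# have already been pulled -- both names are just examples.
import json
import urllib.request

OLLAMA_URL = "http://localhost:11434/api/generate"

def generate(model: str, prompt: str) -> str:
    """Send one non-streaming generate request. Ollama loads the model
    on first use and unloads it after the keep_alive idle window."""
    body = json.dumps({
        "model": model,
        "prompt": prompt,
        "stream": False,
        # How long the model stays resident after this request
        # (Ollama's default is around 5 minutes).
        "keep_alive": "5m",
    }).encode()
    req = urllib.request.Request(
        OLLAMA_URL,
        data=body,
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())["response"]

# Same server, same API, two different models loaded on demand.
print(generate("llama3", "Say hi in five words."))
print(generate("mistral", "Say hi in five words."))
```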