Comment by leopoldj
6 days ago
For production use of open-weight models I'd reach for something like Amazon Bedrock, Google Vertex AI (which uses vLLM), or on-prem vLLM/SGLang. But for a quick assessment of a model as a developer, Ollama Turbo looks appealing. I find GCP incredibly user-hostile and a nightmare when it comes to navigating quotas and the like.
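For the quick-assessment workflow, here's a minimal sketch of what that looks like against a locally running Ollama server; the model tag and prompt are placeholders, and the assumption is that Ollama Turbo exposes the same API behind its hosted endpoint:

```python
# Minimal sketch: smoke-test an open-weight model via Ollama's local HTTP API.
# Model tag and prompt are placeholders; use whatever you pulled with `ollama pull`.
import json
import urllib.request

payload = {
    "model": "llama3.1",  # hypothetical model tag
    "prompt": "Summarize the tradeoffs of MoE vs dense models in two sentences.",
    "stream": False,      # return a single JSON object instead of a stream
}

req = urllib.request.Request(
    "http://localhost:11434/api/generate",
    data=json.dumps(payload).encode("utf-8"),
    headers={"Content-Type": "application/json"},
)

with urllib.request.urlopen(req) as resp:
    body = json.load(resp)

print(body["response"])
```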