Comment by walterbell
9 months ago
> While Qwen3 and DeepSeek are impressive, the infrastructure costs for running these at scale remain prohibitive for most use cases. The economics still don't work
There are dedicated LLM hosting providers like Cerebras and Groq who can actually make money on each user inference query.

Cerebras (wafer-scale engine) and Groq (LPU, designed by ex-Google-TPU engineers) both run inference-optimized custom hardware.