Comment by aleggg
2 days ago
Yes. Heavily subsidized LLM inference will not last forever.
We have already seen cost cutting for some models. A model starts strong, but over time the parent company switches to heavily quantized versions to save on compute costs.
Companies are bleeding money, and eventually prices will have to adjust, even for a behemoth like Google.
That is why running local models is important.