Comment by aleggg
2 days ago
Yes. Heavily subsidized LLM inference will not last forever.
We have already seen cost cutting for some models. A model starts strong, but over time the parent company switches to heavily quantized versions to save on compute costs.
Companies are bleeding money, and eventually prices will have to adjust, even for a behemoth like Google.
That is why running local models is important.