
Comment by openquery

7 hours ago

For 99% of people I don't see the use case (except for privacy, but that ship sailed a decade ago for the aforementioned 99%). If the argument is offline inference, the modern computing experience is basically all done through the browser anyway, so I don't buy it.

GPUs for video games, where you need low latency, make sense; Nvidia GeForce Now works, but not for any serious gaming. When it comes to LLMs, though, the ~100ms of latency between you and the Gemini API (or whichever provider you use) is negligible compared to the inference time itself.
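A rough back-of-envelope sketch of that point, with assumed numbers (response length and generation speed are guesses, not measurements):

```python
# Compare an assumed ~100 ms network round trip to typical LLM generation time.
network_rtt_s = 0.100   # assumed round trip to a hosted API
tokens_out = 500        # assumed length of a typical response
tokens_per_s = 50       # assumed server-side generation speed

inference_s = tokens_out / tokens_per_s            # ~10 s of generation
network_share = network_rtt_s / (network_rtt_s + inference_s)

print(f"inference: {inference_s:.1f}s, network share: {network_share:.1%}")  # ~1%
```

With anything like those numbers, the network hop is around 1% of the total wait.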

What am I missing?

I'm sure giants like Microsoft would like to add more AI capabilities, and I'm also sure they would like to avoid running them on their own servers.

Another thing is that I wouldn't expect LLMs to be free forever. One day the CEOs will decide that everyone has become accustomed to them, and that will be the first day of subscription-based pricing and the last day of AI companies reporting losses.