Comment by crooked-v
10 months ago
I think the real power of customized small models will be running things on local hardware, except that we're in an awkward phase where the local hardware isn't quite beefy enough to run anything really useful yet. Maybe Apple will do something interesting in that space at WWDC.
Also not feasible. A network request to Groq-type machines will outperform your local hardware by such a huge margin that local inference won't make sense outside of a few very niche tasks.
Except nobody but Groq has machines like that, and the economics of cloud AI are very hard to make work in practice. Offloading the capital cost (which is the hardest kind of cost for a company to swallow) to customers is very compelling business-wise.
What they are doing is not very special; soon a lot of companies will do it. Let's call it end-to-end LLM hardware.
Network availability, latency, privacy, etc.: there are many qualities to consider beyond model size and performance for applications.
And cost-efficiency: if I'm using an LLM as a Siri-like assistant on my phone, most of the tasks I'll want it to do won't be that complicated, and it would be a waste to send them to some SOTA LLM in the cloud that I'd have to pay for via a monthly subscription or on a per-token basis.
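To make that routing idea concrete, here's a minimal sketch of a local-first assistant loop that keeps cheap tasks on-device and escalates the rest to a hosted model. Everything here is hypothetical: `run_local_model`, `call_cloud_api`, and the complexity heuristic are illustrative placeholders, not any real SDK or API.

```python
# Sketch of local-first routing for an on-device assistant (all names hypothetical).

def estimate_complexity(prompt: str) -> float:
    """Crude proxy for task difficulty: longer, multi-step prompts score higher."""
    steps = prompt.count("?") + prompt.count(" then ")
    return min(1.0, len(prompt) / 2000 + 0.2 * steps)

def run_local_model(prompt: str) -> str:
    # Hypothetical: invoke a small on-device model (e.g. via llama.cpp bindings).
    return f"[local] {prompt[:40]}"

def call_cloud_api(prompt: str) -> str:
    # Hypothetical: send the prompt to a hosted SOTA model, paid per token.
    return f"[cloud] {prompt[:40]}"

def answer(prompt: str, cloud_threshold: float = 0.6) -> str:
    """Route simple requests to the local model; escalate complex ones."""
    if estimate_complexity(prompt) < cloud_threshold:
        return run_local_model(prompt)
    return call_cloud_api(prompt)

print(answer("Set a timer for ten minutes"))        # simple: stays on-device, costs nothing
print(answer("Summarize this contract, then draft a reply covering every clause? " * 20))  # escalates
```

The threshold is the knob: tune it so the per-token cloud spend only kicks in for requests the small model would actually fumble.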