Comment by bastawhiz
5 days ago
That's the point: why would you buy a device that's specifically not optimized to be used for 24/7 inference? It's expensive hardware that's not designed to be used in that situation! The power use for inference isn't especially good and you're not getting even a fraction of the benefit from the hardware that you're paying for.
> why would you buy a device that's specifically not optimized to be used for 24/7 inference
because it costs $1k-$2k instead of $10k-$30k+ for optimized devices
Nobody is suggesting you buy a pair of A100s, which is what $15k gets you these days. Get a used 5090. And the author specifically priced the hardware at over $4k, which is double the $1k-$2k you're citing.
Good question, but people are doing it anyway. Right now, tons of people are buying Mac Minis specifically for this use case, treating them as personal data centers for agents. To them, the concept of "power use for inference" is foreign. I think those people are the ones who motivated this blog post.