Comment by mharrison

21 hours ago

Odd take. I'm running them locally at my desk (DGX Spark and 128GB MBP). They work fine for 90% of what most folks do. Admittedly, they do run slower on my hw than on the cloud.

8 comments

__mharrison__

pants2 21 hours ago

Running them locally is cool and has privacy/autonomy benefits, but you can't really make a value case for it. Guaranteed if you run the math you will never run enough inference to pay off your hardware vs buying tokens. Last time I ran the math on my MBP I'd have to run inference 24 hours a day for 5+ years to pay off the cost of my MBP, not accounting for electricity costs.

iooi 21 hours ago
Is this because of the tok/s? Since it's pretty easy to run up a $5k bill in API usage for Claude/ChatGPT in a month.
- pants2 21 hours ago
  
  Yes, because of the limits on tok/s, and you have to compare apples to apples, not Gemma 27B to Opus 4.7.
  
  3 replies →
slopinthebag 19 hours ago

The value of not having a reliance on a third party company, and not needing an internet connection, and having total privacy: ∞
fragmede 19 hours ago

Just have to put some numbers on privacy and autonomy. What's the fine to my company if I get hacked and leak all my customer's PII? What's the cost in productivity lost if OpenAI/Anthropic/Google decides to suspend my account for an unknown reason?

Comment by __mharrison__

Comment by mharrison