Comment by andix
3 days ago
I wouldn't run local models on the development PC. Instead run them on a box in another room or another location. Less fan noise and it won't influence the performance of the pc you're working on.
Latency is not an issue at all for LLMs, even a few hundred ms won't matter.
It doesn't make a lot of sense to me, except when working offline while traveling.
Less of a concern these days with hardware like a Mac Studio or Nvidia dgx which are accessible and aren’t noisy at all.
I'm not fully convinced that those devices don't create noise at full power. But one issue still remains: LLMs eating up compute on the device you're working on. This will always be noticeable.