Comment by weird-eye-issue

18 days ago

Yeah, just run an LLM with over 100 billion parameters on a CPU.

200 GB is an unfathomable amount of main memory for a CPU

(With apologies for the snark:) give gpt-oss-120b a try. It's not fast at all, but it can generate on a CPU.

  • But it's far less capable than SOTA models. OP wants high-quality output and doesn't need it fast; your suggestion would give them slow AND low-quality output.