weird-eye-issue, 18 days ago:
Yeah, just run an LLM with over 100 billion parameters on a CPU.

kristjansson, 18 days ago:
200 GB is hardly an unfathomable amount of main memory for a CPU (with apologies for the snark). Give gpt-oss-120b a try. It's not fast at all, but it can generate on CPU.

awestroke, 18 days ago:
But it's far less capable than SOTA models. OP wants high-quality output and doesn't need it fast. Your suggestion would mean output that is both slow and low quality.

kristjansson, 17 days ago:
Then set your parameters to make that point: "Yeah, just run a 1T+ model on CPU."
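For context on the memory figures being argued over, here is a rough back-of-envelope sketch, counting only the model weights (it ignores KV cache, activations, and runtime overhead, so real requirements are somewhat higher):

```python
def weight_gb(params_billion: float, bytes_per_weight: float) -> float:
    """Approximate RAM needed to hold the weights alone, in GB."""
    return params_billion * 1e9 * bytes_per_weight / 1e9

# A 120B-parameter model at common precisions:
print(weight_gb(120, 2.0))   # fp16: 240 GB -- in range of the "200 GB" figure
print(weight_gb(120, 0.5))   # 4-bit quantized: 60 GB -- fits in high-end workstation RAM

# The 1T+ model from the last comment, for comparison:
print(weight_gb(1000, 0.5))  # 4-bit: 500 GB -- server-class territory even quantized
```

This is why the parameter count matters to the argument: a quantized 120B model fits in plausible desktop RAM, while a 1T+ model does not.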