Comment by SXX
6 hours ago
Now we need someone to try running Kimi K2.6 on an old Xeon with DDR3. After all, these platforms do support up to 768GB of RAM.
You can run these on a Turing machine. At what point is it not worth it? At some point the energy needed to generate each token matters. We often see tokens per second quoted. I think the missing metric is tokens per kilowatt-hour. That is what really matters.
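A rough sketch of that metric, read as tokens per kilowatt-hour (energy, not power). The function name and the example numbers are made up for illustration, and it assumes you have measured average wall power while generating:

```python
def tokens_per_kwh(tokens_per_second: float, avg_watts: float) -> float:
    """Tokens generated per kilowatt-hour of energy drawn.

    tokens_per_second: measured generation speed.
    avg_watts: average power draw at the wall during generation (assumed measured).
    """
    if avg_watts <= 0:
        raise ValueError("avg_watts must be positive")
    tokens_per_hour = tokens_per_second * 3600
    kilowatts = avg_watts / 1000
    return tokens_per_hour / kilowatts


# Illustrative inputs only, not measurements of any real machine.
print(tokens_per_kwh(10, 500))
```

Two boxes with the same tokens per second can differ a lot here if one of them draws several times the power.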
It'll work, but it'll yield about a token per minute. With ancient servers, throughput is the limiting factor, not memory size.
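One way to reason about the throughput limit, as a back-of-the-envelope sketch: if generation is bound by memory bandwidth, the ceiling on speed is roughly bandwidth divided by the data that has to be read per token. The function name and the numbers below are placeholders, not figures for any real server or model, and real throughput can land well below this ceiling once compute and other overheads are counted:

```python
def max_tokens_per_second(bandwidth_gb_s: float, gb_read_per_token: float) -> float:
    """Upper bound on generation speed when memory bandwidth is the bottleneck.

    bandwidth_gb_s: sustained memory bandwidth in GB/s (assumed, not measured here).
    gb_read_per_token: GB of weights that must be read to produce one token (assumed).
    """
    if bandwidth_gb_s <= 0 or gb_read_per_token <= 0:
        raise ValueError("both inputs must be positive")
    return bandwidth_gb_s / gb_read_per_token


# Placeholder inputs: 40 GB/s of bandwidth, 20 GB read per token.
print(max_tokens_per_second(40, 20))
```

Note that total RAM capacity does not appear anywhere in the estimate, which is the point: fitting the model in memory does not make it fast.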