Comment by zmmmmm
16 hours ago
In the end, there's not much point in having more memory than you can compute on in a reasonable time. So I think the useful amount probably tops out in the 128GB range, where you can still run a 70B model and get a useful token rate out of it.
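A rough back-of-envelope sketch of that reasoning, assuming token generation is memory-bandwidth bound (every weight is read once per generated token) and ignoring KV cache and other overhead. The function names and the 400 GB/s bandwidth figure are hypothetical, chosen only for illustration, not measurements of any particular machine:

```python
def model_size_gb(params_billion: float, bytes_per_param: float) -> float:
    """Approximate weight footprint in GB (weights only, no KV cache)."""
    return params_billion * bytes_per_param


def decode_tokens_per_second(params_billion: float,
                             bytes_per_param: float,
                             bandwidth_gb_s: float) -> float:
    """Rough upper bound on decode speed if each token reads all weights once."""
    return bandwidth_gb_s / model_size_gb(params_billion, bytes_per_param)


if __name__ == "__main__":
    BANDWIDTH_GB_S = 400.0  # hypothetical memory bandwidth, an assumption
    for label, bytes_per_param in [("4-bit", 0.5), ("8-bit", 1.0), ("16-bit", 2.0)]:
        size = model_size_gb(70, bytes_per_param)
        rate = decode_tokens_per_second(70, bytes_per_param, BANDWIDTH_GB_S)
        print(f"70B @ {label}: ~{size:.0f} GB weights, <= {rate:.1f} tok/s")
```

Under these assumptions the estimated rate falls in proportion to the bytes you have to read per token, so a model big enough to fill far more than 128GB would generate correspondingly slower at the same bandwidth.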