Comment by green7ea
3 hours ago
It should be easy with a Q4 (quantization to 4 bits per weight) and a smallish context.
You won't have much RAM left over though :-/.
At Q4, ~20 GiB
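For anyone who wants to check the arithmetic, here is a rough sketch of how bits per weight translate into memory for the weights alone. The function names are made up for illustration, and the parameter count is back-derived from the ~20 GiB figure rather than taken from any model card. It assumes exactly 4 bits per weight; real 4-bit formats usually store per-block scales as well, so the effective figure is often somewhat higher. The KV cache for the context comes on top of this.

```python
GIB = 2**30  # bytes in one GiB


def weight_memory_gib(n_params: float, bits_per_weight: float) -> float:
    """Approximate memory for the weights alone, in GiB (no KV cache, no overhead)."""
    return n_params * bits_per_weight / 8 / GIB


def params_for_budget(gib: float, bits_per_weight: float) -> float:
    """Inverse: how many parameters fit in a given weight budget."""
    return gib * GIB * 8 / bits_per_weight


if __name__ == "__main__":
    # Hypothetical: parameter count implied by ~20 GiB at exactly 4 bits/weight.
    n = params_for_budget(20, 4)
    print(f"implied parameters: {n / 1e9:.1f}B")
    for bits in (4, 8, 16):
        print(f"{bits:>2} bits/weight: {weight_memory_gib(n, bits):.0f} GiB")
```

Same weights at 16 bits per weight would need four times the memory, which is why Q4 is what makes it fit at all.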