Comment by wongarsu
8 hours ago
Anything below one billion parameters you can run on the CPU at acceptable speed.
For larger sizes you still can, it just gets progressively slower. For a simple classification task (small input, tiny output, and output you can constrain to a couple of tokens), you could even run something like a 4B or 8B model on the CPU.
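To put rough numbers on that, here is a back-of-the-envelope sketch. It assumes token generation is limited by memory bandwidth, meaning every generated token requires reading all the weights from RAM once. That is a common rule of thumb, not something measured here. The function names, the 50 GB/s bandwidth figure, and the bytes-per-parameter values are illustrative assumptions, and the estimate ignores prompt processing, which is compute-bound.

```python
def estimate_decode_tokens_per_sec(params_billions: float,
                                   bytes_per_param: float,
                                   mem_bandwidth_gb_s: float) -> float:
    """Rough upper bound on generation speed for a dense model on a CPU.

    Assumption: each generated token streams every weight from RAM once,
    so speed is roughly memory bandwidth divided by model size.
    """
    model_size_gb = params_billions * bytes_per_param
    return mem_bandwidth_gb_s / model_size_gb


def estimate_generation_seconds(output_tokens: int, tokens_per_sec: float) -> float:
    """Time to emit a short, constrained output (e.g. a class label)."""
    return output_tokens / tokens_per_sec


if __name__ == "__main__":
    # Hypothetical machine: about 50 GB/s of usable memory bandwidth.
    bandwidth_gb_s = 50.0
    # (label, parameters in billions, bytes per parameter); all illustrative.
    configs = [
        ("1B at 16-bit", 1.0, 2.0),
        ("4B at 4-bit", 4.0, 0.5),
        ("8B at 4-bit", 8.0, 0.5),
    ]
    for label, params_b, bytes_pp in configs:
        tps = estimate_decode_tokens_per_sec(params_b, bytes_pp, bandwidth_gb_s)
        secs = estimate_generation_seconds(2, tps)
        print(f"{label}: ~{tps:.1f} tok/s, ~{secs:.2f} s for a 2-token label")
```

Under these assumptions, a two-token label is cheap to generate even at 8B, which is why constraining the output matters so much. Real throughput will be lower and depends heavily on the hardware, the quantization, and the prompt length.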