Comment by nok22kon
20 hours ago
The paper is a bit old, but it matches the current empirical recommendation: a good starting point is the biggest model you can fit at 4-bit quantization.
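For a rough sense of what "fits" means, here is a back-of-the-envelope sketch. It counts only the weights at a nominal 4 bits each and reserves a fraction of memory for everything else (KV cache, activations, runtime). The 20% reserve, the function names, and the candidate sizes are my own assumptions for illustration, not something from the paper. Real 4-bit formats also store scales, so the effective bits per weight come out somewhat higher than 4.

```python
def weight_memory_gib(n_params_billion: float, bits_per_weight: float = 4.0) -> float:
    """Approximate memory needed for the weights alone, in GiB."""
    total_bytes = n_params_billion * 1e9 * bits_per_weight / 8
    return total_bytes / 2**30


def largest_fitting(candidates_billion, budget_gib,
                    bits_per_weight=4.0, overhead_fraction=0.2):
    """Pick the largest candidate whose weights fit in the memory budget.

    overhead_fraction is an assumed reserve for KV cache, activations,
    and runtime overhead; tune it for your own setup.
    Returns None if no candidate fits.
    """
    usable_gib = budget_gib * (1 - overhead_fraction)
    fitting = [n for n in candidates_billion
               if weight_memory_gib(n, bits_per_weight) <= usable_gib]
    return max(fitting) if fitting else None


if __name__ == "__main__":
    # Hypothetical model sizes (billions of parameters) and a 24 GiB budget.
    sizes = [7, 13, 34, 70]
    for n in sizes:
        print(f"{n}B at 4 bits: about {weight_memory_gib(n):.1f} GiB of weights")
    print("largest that fits in 24 GiB:", largest_fitting(sizes, 24))
```

Treat the result as a starting point only: long contexts can make the KV cache much larger than the reserve assumed here.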