Comment by rapatel0
1 month ago
I forked it to add rotorquant as well. It's an optimization that uses Clifford rotors, instead of a static random permutation fixed at compile time, to store the activations. This reduces both the storage space and the parameter count.
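
As a rough illustration of the parameter-count point (a sketch under assumptions, not code from the fork, and not necessarily how rotorquant works internally): in 3D, a Clifford rotor is equivalent, up to sign convention, to a unit quaternion, so a rotation can be stored as 4 numbers instead of the 9 entries of a dense rotation matrix. The function names below are hypothetical.

```python
import numpy as np

def rotor_from_axis_angle(axis, angle):
    """Build a 3D rotor as [scalar, bivector_x, bivector_y, bivector_z].

    Assumption: the rotor is stored in unit-quaternion form (4 floats).
    """
    axis = np.asarray(axis, dtype=float)
    axis = axis / np.linalg.norm(axis)
    half = 0.5 * angle
    return np.concatenate(([np.cos(half)], np.sin(half) * axis))

def apply_rotor(rotor, v):
    """Rotate v with the sandwich product R v R~, expanded in quaternion form."""
    w, u = rotor[0], rotor[1:]
    v = np.asarray(v, dtype=float)
    t = 2.0 * np.cross(u, v)
    return v + w * t + np.cross(u, t)

def apply_inverse_rotor(rotor, v):
    """Undo the rotation by applying the reversed rotor (bivector part negated)."""
    reverse = np.concatenate(([rotor[0]], -rotor[1:]))
    return apply_rotor(reverse, v)

# Hypothetical usage: rotate a vector, then recover it.
rotor = rotor_from_axis_angle([0.0, 0.0, 1.0], np.pi / 2)
rotated = apply_rotor(rotor, [1.0, 0.0, 0.0])
recovered = apply_inverse_rotor(rotor, rotated)
```

The rotor here holds 4 values where the equivalent 3x3 matrix holds 9, and the rotation is exactly invertible by negating the bivector part.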