Comment by AlexCoventry
1 day ago
I have a nonlinear attention mechanism that seems to improve data efficiency, but it's slow. I'm trying to learn the Python CuTe DSL to speed it up.
I'm also reading Principles and Practice of Deep Representation Learning, Or: A Mathematical Theory of Memory.