Comment by _giorgio_
2 years ago
> the paper that was most helpful for me was [Formal Algorithms for Transformers](https://arxiv.org/abs/2207.09238)
Interesting, but hard to read, since it uses quite unusual notation for matrix indexing and multiplication. Why???