Comment by lostmsu

2 years ago

I think this is a case of a person with a hammer seeing everything as a nail. Attention is no more a kernel mechanism than it is a form of matrix decomposition or even a bilinear form. It is similar to all of these things, but not quite the same as any of them.
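
To make the comparison concrete, here is a rough NumPy sketch of single-head scaled dot-product attention read three ways. The shapes, the random weights, and names such as `W_q`, `W_k`, and `A` are illustrative assumptions of mine, not anything taken from the post being discussed.

```python
import numpy as np

rng = np.random.default_rng(0)
d_model, d_head, n = 8, 4, 5  # hypothetical sizes, chosen only for illustration

X = rng.normal(size=(n, d_model))         # n token embeddings
W_q = rng.normal(size=(d_model, d_head))  # query projection
W_k = rng.normal(size=(d_model, d_head))  # key projection
W_v = rng.normal(size=(d_model, d_head))  # value projection
Q, K, V = X @ W_q, X @ W_k, X @ W_v

# Bilinear-form view: the raw score between tokens i and j is x_i^T A x_j.
A = W_q @ W_k.T
scores = Q @ K.T
bilinear_match = np.allclose(scores, X @ A @ X.T)

# Matrix-decomposition view: A is stored as a product of two thin factors,
# so its rank can be at most d_head.
rank_A = np.linalg.matrix_rank(A)

# Kernel view: exp(score) acts as a similarity weight, and the output is a
# normalized weighted average of the values (a Nadaraya-Watson style smoother).
scaled = scores / np.sqrt(d_head)
weights = np.exp(scaled)
kernel_out = (weights / weights.sum(axis=1, keepdims=True)) @ V

# The usual numerically stable softmax attention, for comparison.
shifted = scaled - scaled.max(axis=1, keepdims=True)
soft = np.exp(shifted)
soft /= soft.sum(axis=1, keepdims=True)
attn_out = soft @ V
kernel_match = np.allclose(attn_out, kernel_out)

# "Not quite the same": with W_q != W_k the similarity is not symmetric,
# which a kernel in the usual (Mercer) sense would have to be.
kernel_is_symmetric = np.allclose(weights, weights.T)
```

Each reading holds up to a point: the scores are a bilinear form until the softmax is applied, the factorization is learned rather than computed from a given matrix, and the "kernel" is asymmetric whenever the query and key projections differ.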