Comment by eurekin
2 years ago
re.:
> The problem of "identification" was quickly solved by another engineering feat, which was to slap on "positional embeddings". As usual, this too didn't happen because there was a deep mathematical understanding. Rather, it was attempted and it worked.
Wasn't that tried because of robotics?
It's a commonly solved problem that a robot's hand must know each joint's orientation in space. Typically, each joint (a degree of freedom) has a rotary encoder built in. There is more than one type, but the "absolute" type is the one that matches the scheme used in positional embeddings:
https://www.akm.com/content/www/akm/global/en/products/rotat...
(full article: https://www.akm.com/global/en/products/rotation-angle-sensor... )
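To make the encoder side of the parallel concrete, here is a minimal sketch of an idealized absolute encoder. It is not taken from the linked article: the function name `encoder_tracks` and the plain binary layout are assumptions for illustration (real absolute encoders often use Gray code or magnetic sensing instead). The point is only that every track repeats at a different rate, and reading all tracks together identifies the position without counting from a reference.

```python
def encoder_tracks(position: int, num_tracks: int = 4) -> list[int]:
    """Read every track of an idealized absolute encoder at one position.

    Hypothetical model: track k is a square wave with a period of
    2**(k + 1) positions, so the tracks together spell out the binary
    representation of the position (least significant bit first).
    """
    return [(position >> k) & 1 for k in range(num_tracks)]


# Each position yields a unique combination of track readings.
for pos in range(8):
    print(pos, encoder_tracks(pos))
```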
I find that parallel very fitting, since a positional embedding uses a sequence of sinusoids spanning a range of frequencies. In GPTs with learned positional embeddings (such as GPT-2), where the network is free to use anything it likes, it seems to learn the same pattern as the predefined one (albeit a bit more wonky).
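For reference, a rough sketch of the predefined (sinusoidal) embedding, following the usual formulation where dimension pairs hold sin and cos of the position at geometrically spaced wavelengths. The function name `sinusoidal_embedding` and the defaults `dim=8`, `base=10000.0` are my own choices for illustration. In this formulation the wavelength grows with the dimension index, though the ordering is only a convention.

```python
import math


def sinusoidal_embedding(position: int, dim: int = 8, base: float = 10000.0) -> list[float]:
    """Fixed sinusoidal positional embedding for one position (dim must be even)."""
    embedding = []
    for i in range(0, dim, 2):
        # Wavelengths grow geometrically from 2*pi toward base * 2*pi,
        # much like encoder tracks that each repeat at a different rate.
        angle = position / base ** (i / dim)
        embedding.append(math.sin(angle))
        embedding.append(math.cos(angle))
    return embedding


# Nearby positions differ mostly in the fast (short wavelength) dimensions.
for pos in range(4):
    print(pos, [round(x, 3) for x in sinusoidal_embedding(pos)])
```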