Comment by pizza
2 days ago
They do have a weak relationship: tokens with lower indices were encountered earlier while the vocabulary was being built, so tokens with nearby indices tend to be similar in typicality.
2 days ago
No, if you check the diagram (page 2), these are literally indexes into the KV vectors, not positional indexes in the text. If they were positions in the text, I would agree with you.
Oh, I thought you were talking about the unorderedness of embedding indices in general, so I was responding with the specific case of vocab embedding indices, which do have a correlation. My apologies.