libraryofbabel 2 days ago
Oh yes! That's probably more important, in fact.
mnicky 2 days ago
Well, I think that this is also the answer to your question about the intuition.
If the asymmetry of K and Q stems from the direction of the softmax application, it must also be the reason for the names of the matrices :)
And if you think about it, it makes sense that for each Query, the weights over all of the Keys sum to 1, and not vice versa.
So this is my only intuition for the K and Q names.
(It may or may not be similar to the whole "db lookup thing"... I just don't use that one.)
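To make the softmax direction concrete, here's a minimal sketch of standard scaled dot-product attention weights in plain Python. The function name `attention_weights` and the toy Q and K values are made up for illustration; this isn't taken from any particular library. The softmax runs along the key axis, so each query's row of weights sums to 1, while the per-key column sums generally don't.

```python
import math

def attention_weights(Q, K):
    """Scaled dot-product attention weights.

    Q is a list of query vectors, K a list of key vectors (same dimension).
    Returns one row per query; the softmax is taken over the keys.
    """
    d = len(Q[0])
    weights = []
    for q in Q:
        # Score of this query against every key, scaled by sqrt(d).
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d) for k in K]
        # Numerically stable softmax over the keys.
        m = max(scores)
        exps = [math.exp(s - m) for s in scores]
        total = sum(exps)
        weights.append([e / total for e in exps])
    return weights

# Toy values, assumed purely for illustration: 3 queries, 2 keys.
Q = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
K = [[1.0, 0.0], [0.5, 0.5]]

W = attention_weights(Q, K)
row_sums = [sum(row) for row in W]        # one sum per query
col_sums = [sum(col) for col in zip(*W)]  # one sum per key

print("per-query sums:", row_sums)
print("per-key sums:  ", col_sums)
```

With 3 queries and 2 keys, the per-key sums have to add up to 3 in total, so they can't all equal 1. Swapping the roles of Q and K would change which side gets normalized, which is the asymmetry in question.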