Comment by yobbo

4 days ago

Embeddings represent more than P("found in the same context").

It is true that cosine similarity is unhelpful if you expect it to be a distance measure.

[0,0,1] and [0,1,0] are orthogonal (cosine 0) but have euclidean distance √2, and 1/3 of vector elements are identical.

It is better if embeddings encode also angles, absolute and relative distances in some meaningful way. Testing only cosine ignores all distances.

Modern embeddings lie on a hypersphere surface, making euclidean equal to cosine. And if they don't, I probably wouldn't want to use them.

  • True, on a hypersphere cosine and euclidean are equivalent.

    But if random embeddings are gaussian, they are distributed on a "cloud" around the hypersphere, so they are not equal.