Comment by lysecret

2 years ago

To me this is more of a negative for DL-based similarity than a win for that method.

With this whole LLM craze (and they are incredible), I think a lot of people just assume we've made similar advances in the embedding layer for pure text similarity.

Hence all this embeddings-DB bonanza. But as far as I can see, there is close to no evidence for that.

https://twitter.com/eugeneyan/status/1678060204943097863

>When Deepmind needs semantic retrieval, they just use the largest index on the planet.

Fun fact: query-doc similarity was done via simple TF-IDF instead of vectors. It performed better than vector retrieval when the number of retrieved docs was > 45 (they used 50).
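
For reference, that kind of query-doc TF-IDF similarity is just cosine over sparse term vectors; a minimal sketch with scikit-learn (toy corpus and query are made up, this is the generic setup, not DeepMind's actual pipeline):

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

# Toy corpus; stand-in for a real document collection.
docs = [
    "neural retrieval with dense embeddings",
    "classic sparse retrieval with tf-idf",
    "bm25 is a strong zero-shot baseline",
]
query = "sparse tf-idf retrieval"

# Fit TF-IDF on the docs, then project the query into the same vocab space.
vectorizer = TfidfVectorizer()
doc_vecs = vectorizer.fit_transform(docs)
query_vec = vectorizer.transform([query])

# Cosine similarity between the query and every doc; rank descending.
scores = cosine_similarity(query_vec, doc_vecs)[0]
ranking = scores.argsort()[::-1]
print([(docs[i], round(scores[i], 3)) for i in ranking])
```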

https://blog.vespa.ai/improving-zero-shot-ranking-with-vespa...

>This case illustrates that in-domain effectiveness does not necessarily transfer to an out-of-domain zero-shot application of the model. Generally, as observed on the BEIR dense leaderboard, dense embeddings models trained on NQ labels underperform the BM25 baseline across almost all BEIR datasets.
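
For anyone who wants to poke at the BM25 baseline itself, a minimal sketch with the rank_bm25 package (toy corpus and naive whitespace tokenization, just to show the zero-shot setup):

```python
from rank_bm25 import BM25Okapi

# Toy corpus; in a BEIR-style evaluation this would be the target domain's docs.
docs = [
    "neural retrieval with dense embeddings",
    "classic sparse retrieval with tf-idf",
    "bm25 is a strong zero-shot baseline",
]
tokenized = [d.split() for d in docs]  # naive whitespace tokenization

bm25 = BM25Okapi(tokenized)
query = "zero-shot retrieval baseline".split()

# BM25 needs no training labels at all, which is exactly why it
# transfers out of domain better than embeddings trained on NQ.
scores = bm25.get_scores(query)
print(sorted(zip(docs, scores), key=lambda x: -x[1]))
```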

Could you answer a question, please? To make a text embedding with an LLM, the kind you would use for similarity metrics, which layer is used? The input layer? Input layer + positional encoding? A hidden layer? The output layer?
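
To make the question concrete, the recipe I've seen most often (e.g. in sentence-transformers) is mean-pooling the last hidden layer over non-padding tokens; whether that's actually the right layer is what I'm asking. A sketch with Hugging Face transformers (the model name is just one common example):

```python
import torch
from transformers import AutoModel, AutoTokenizer

# Example model only; any encoder with a last_hidden_state works the same way.
name = "sentence-transformers/all-MiniLM-L6-v2"
tok = AutoTokenizer.from_pretrained(name)
model = AutoModel.from_pretrained(name)

batch = tok(["a sentence to embed"], return_tensors="pt", padding=True)
with torch.no_grad():
    out = model(**batch)  # out.last_hidden_state: (batch, seq, dim)

# Mean-pool the final hidden layer over real (non-padding) tokens.
mask = batch["attention_mask"].unsqueeze(-1).float()
emb = (out.last_hidden_state * mask).sum(1) / mask.sum(1)
emb = torch.nn.functional.normalize(emb, dim=1)  # unit norm for cosine similarity
print(emb.shape)  # (1, 384) for this model
```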