Comment by OutOfHere

2 days ago

Does OpenAI's text-embedding-3-large or text-embedding-3-small embedding model have the Matryoshka property?

They do; they just don't advertise it well (and only confirmed it in a footnote after criticism of its omission): https://openai.com/index/new-embedding-models-and-api-update...

> Both of our new embedding models were trained with a technique that allows developers to trade-off performance and cost of using embeddings. Specifically, developers can shorten embeddings (i.e. remove some numbers from the end of the sequence) without the embedding losing its concept-representing properties by passing in the dimensions API parameter. For example, on the MTEB benchmark, a text-embedding-3-large embedding can be shortened to a size of 256 while still outperforming an unshortened text-embedding-ada-002 embedding with a size of 1536.
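The footnote above describes two equivalent routes: pass the `dimensions` API parameter, or truncate the returned vector yourself. If you truncate client-side, the shortened vector is no longer unit-length, so you should re-normalize it before computing cosine similarity. A minimal sketch of that client-side truncation (the function name and the toy 4-d vector are illustrative, standing in for a real 3072-d `text-embedding-3-large` output):

```python
import math

def shorten_embedding(embedding, dim):
    """Truncate a Matryoshka-style embedding to its first `dim` values
    and re-normalize to unit length, so cosine similarity still behaves
    as it does on the full-size vector."""
    truncated = embedding[:dim]
    norm = math.sqrt(sum(x * x for x in truncated))
    return [x / norm for x in truncated]

# Toy 4-d vector standing in for a full-size embedding:
vec = [0.5, 0.5, 0.5, 0.5]
short = shorten_embedding(vec, 2)
print(short)  # shortened and unit-length again
```

With the API route you would instead request `dimensions=256` directly and get back an already-normalized vector, which is what the quoted MTEB comparison refers to.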