Comment by breadislove

2 days ago

That's actually not correct. Embeddings can handle relationships like “without” or “not” when trained for it. You need to scale up the training massively for the model to generalize well. The current version of Mixedbread Search supports negations like "tshirt without stripes". You can check it out in our launch video [1]. We are working on a much more generalized model, which should be able to capture relationships, emotions, and much more. The current models are just limited.

[1]: https://www.mixedbread.com/blog/mixedbread-search

I was referring specifically to popular embedding models like OpenAI’s and sentence-transformers, which (as far as I know) don’t reliably handle negation or emotional nuance; they mostly capture topical similarity.

I don’t know enough of the underlying math to say for sure whether embeddings can be trained to consistently represent negation, but when I tried the Mixedbread demo myself with a query like “winter landscapes without sun and trees”, it still showed me paintings with both sun and trees. So at least in its current form, it doesn’t seem to fully handle those semantic relationships yet.
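To illustrate what I mean by "topical similarity", here is a minimal sketch. It uses a toy bag-of-words embedder (`toy_embed`, a hypothetical stand-in of my own, not how OpenAI, sentence-transformers, or Mixedbread actually work) and two made-up captions. It only shows why a representation driven by word overlap ranks the wrong result first for a negated query; it says nothing about what a properly trained model can do.

```python
import math
from collections import Counter


def toy_embed(text):
    """Hypothetical stand-in for an embedding model: a bag-of-words count vector."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[token] * b[token] for token in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def rank(query, docs, embed=toy_embed):
    """Sort docs by similarity to the query, most similar first."""
    q = embed(query)
    return sorted(docs, key=lambda d: cosine(q, embed(d)), reverse=True)


query = "winter landscapes without sun and trees"
docs = [
    "winter landscapes with sun and trees",      # what the query excludes
    "winter landscapes under an overcast sky",   # what the query wants
]

ranked = rank(query, docs)
for doc in ranked:
    print(f"{cosine(toy_embed(query), toy_embed(doc)):.2f}  {doc}")
```

The caption that contains "sun" and "trees" shares more words with the query, so it comes out on top even though the query asks for the opposite. To run the same check against a real model, you could pass its encoder in place of `toy_embed` and swap `cosine` for a dense-vector version.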