Show HN: Semantic search over the National Gallery of Art

3 days ago (nga.demo.mixedbread.com)

How does this work? I thought it was probably powered by embeddings and maybe some more traditional search code, but I checked out the linked GitHub repo and I didn't see any model/inference code. Is the public code a wrapper that communicates with your commercial API?

Some searches work like magic and others seem to veer off target a lot. For example, "sculpture" and "watercolor" worked just about how I'd expect. "Lamb" showed lambs and sheep. But "otter" showed a random selection of animals.

  • It is powered by Mixedbread Search, which is powered by our model Omni. Omni is multimodal (text, video, audio, images) and multi-vector, which helps us capture more information.

    The search is in beta and we are improving the model. Thank you for reporting the queries that are not working well.

    Edit: Re the otter, I just checked and did not find otters in the dataset. To reduce confusion, we should not return any results when the model is not sure.

    • There's at least a little bit of otter in the data. The one relevant result I saw was "Plate 40: Two Otters and a Beaver" by Joris Hoefnagel.

      I also expected semantic search to return similar results for "fireworks" and "pyrotechnics," since the latter is a less common synonym for the former. But I got many results for fireworks and just one result for pyrotechnics.

      This is still impressive. My impulse is to poke at it with harder cases to try to reason about how it could be implemented. Thanks for your Show HN and for replying to me!

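For anyone curious what "return no results when the model is not sure" could look like in practice, here is a minimal sketch. It assumes each hit carries a similarity score in [0, 1] and that a fixed cutoff is good enough; the `Hit` type, the `filter_confident` helper, the 0.5 threshold, and the scores are all made up for illustration and are not Mixedbread's API or data.

```python
from typing import List, NamedTuple


class Hit(NamedTuple):
    title: str
    score: float  # similarity score, assumed to lie in [0, 1]


def filter_confident(hits: List[Hit], min_score: float = 0.5) -> List[Hit]:
    """Keep only hits at or above min_score, best first; may return an empty list."""
    kept = [h for h in hits if h.score >= min_score]
    return sorted(kept, key=lambda h: h.score, reverse=True)


# Invented scores for an "otter" query, for illustration only.
hits = [
    Hit("Plate 40: Two Otters and a Beaver", 0.71),
    Hit("Unrelated animal study A", 0.32),
    Hit("Unrelated animal study B", 0.29),
]

for hit in filter_confident(hits):
    print(hit.title, hit.score)
```

With a cutoff like this, a query with no good match comes back empty instead of showing a random selection of animals. The hard part in a real system is choosing the threshold, since raw similarity scores are often not comparable across queries.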

This is neat. I'm not sure how to report queries that are working poorly, as you mentioned, but when I search "Waltz" I am presented with kitchen utensils and only one piece showing people dancing. Presumably this is due to the artist's name being 'Walton'.

  • We will add a feedback form tomorrow morning. For now, please feel free to write to aamir at the domain name of the page. Thank you so much! This helps us a lot.

I recently learned that semantic search embeddings mostly represent topics and concepts, but they don’t handle negation or emotion very well.

For example, if you search for “paintings of winter landscapes but without sun and trees,” you’ll still get results with trees. That’s because embeddings capture the presence of concepts like “tree” or “landscape,” but not logical relationships like “without” or “not.”

Similarly, embeddings aren’t great at capturing how something feels. They can tell that “sad poem” and “happy poem” are different mainly because of the words used, not because they truly understand emotional tone.

This happens because most embedding models (like OpenAI’s or sentence-transformers) are trained to group things by semantic similarity, not logical meaning or sentiment. Negation, polarity, and affect aren’t explicitly represented in the vector space.

Might be common knowledge to some, but it was a cool TIL moment for me, realizing that embeddings are great at what something is about, but not how it feels or what it excludes.

  • That's actually not correct. Embeddings can handle relationships like “without” or “not” when trained for it. You need to scale up the training massively to make it generalize well. The current version of Mixedbread Search supports negatives like "tshirt without stripes". You can check it out in our launch video [1]. We are working on a much more general model, which should be able to capture relationships, emotions and much more. The current models are just limited.

    [1]: https://www.mixedbread.com/blog/mixedbread-search

    • I was referring specifically to popular embedding models like OpenAI’s and sentence-transformers, which (as far as I know) don’t reliably handle negation or emotional nuance; they mostly capture topical similarity.

      I don’t know enough of the underlying math to say for sure whether embeddings can be trained to consistently represent negation, but when I tried the Mixedbread demo myself with a query like “winter landscapes without sun and trees”, it still showed me paintings with both sun and trees. So at least in its current form, it doesn’t seem to fully handle those semantic relationships yet.
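To make the "presence of concepts" intuition from this thread concrete, here is a toy sketch. It uses plain word counts and cosine similarity, which is a deliberate simplification: real embedding models are learned dense vectors, so this says nothing definitive about OpenAI's models, sentence-transformers, or Mixedbread Search. The helper names and the three example strings are made up for illustration.

```python
import math
import re
from collections import Counter


def bag_of_words(text: str) -> Counter:
    """Lowercase the text and count its words; word order and grammar are discarded."""
    return Counter(re.findall(r"[a-z]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two word-count vectors."""
    dot = sum(a[w] * b[w] for w in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


query = bag_of_words("winter landscapes without sun and trees")
with_trees = bag_of_words("winter landscapes with sun and trees")
no_trees = bag_of_words("winter landscapes under an overcast sky")

# The caption containing the excluded concepts shares more words with the
# query, so it scores higher than the caption the user actually wants.
print("with sun and trees:", cosine(query, with_trees))
print("overcast, no trees:", cosine(query, no_trees))
```

In this toy model "without" is just one more token, so mentioning "sun" and "trees" pulls the query toward exactly the results it is trying to exclude. Whether a trained model escapes that depends on its training data and objective, which is the point being debated above.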

I love old stereograms, and was happy to find a couple using this tool!

It would be nice if it took you to the NGA page about the item. I can't even copy the text easily to search for it.

"Images of german shepherds" never fails to provide some humor.

Works really well for some artist names (rembrandt, whistler) and exceedingly poorly for others (john singer sargent).

love that a search for 'chill vibes sculpture' returned a very chill set of results. nice step change in art search capabilities

hey, your service is back up again!!! Mixedbread was my favorite tool for so long since your pivot, and I'm so glad y'all are back

  • We have a lot more things coming up soon. It just took us some time building Mixedbread Search.

When code and canvas meet, a search is not merely about words but about feeling. Between paintings and bars of pixels, the machine tries to understand the unspoken answer.