Comment by CuriouslyC

4 months ago

Vector embeddings are so overhyped. They're decent as a secondary signal, but they're expensive to compute and fragile. BM25-based solutions are more robust and WAY lower latency, at the cost of some accuracy loss vs. hybrid solutions. You can get the majority of the lift from hybrid solutions by doing ingest-time semantic expansion (reverse-HyDE-style input annotation) over a sparse BM25 index, at a fraction of the computational cost.
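To make the idea concrete, here is a rough sketch of ingest-time expansion feeding a plain BM25 index. Everything named here is an assumption for illustration: `BM25Index`, `annotate`, and the `HYPOTHETICAL_QUESTIONS` table are made up, and the table stands in for whatever LLM call would generate "questions this document answers" at ingest. The scoring is a common BM25 variant written from scratch, not any particular library's implementation.

```python
import math
import re
from collections import Counter


def tokenize(text):
    # Lowercase and split on non-alphanumerics; a real system would do more.
    return re.findall(r"[a-z0-9]+", text.lower())


class BM25Index:
    """Minimal in-memory BM25 index (illustrative, not production code)."""

    def __init__(self, docs, k1=1.5, b=0.75):
        self.k1, self.b = k1, b
        self.ids = list(docs)
        self.tfs = [Counter(tokenize(docs[i])) for i in self.ids]
        self.lengths = [sum(tf.values()) for tf in self.tfs]
        self.avgdl = sum(self.lengths) / len(self.lengths)
        n = len(self.ids)
        df = Counter(term for tf in self.tfs for term in tf)
        self.idf = {
            t: math.log(1 + (n - d + 0.5) / (d + 0.5)) for t, d in df.items()
        }

    def search(self, query, top_k=3):
        results = []
        for doc_id, tf, dl in zip(self.ids, self.tfs, self.lengths):
            score = 0.0
            for term in tokenize(query):
                f = tf.get(term, 0)
                if f == 0:
                    continue
                norm = f + self.k1 * (1 - self.b + self.b * dl / self.avgdl)
                score += self.idf[term] * f * (self.k1 + 1) / norm
            if score > 0:
                results.append((doc_id, score))
        results.sort(key=lambda r: r[1], reverse=True)
        return results[:top_k]


# Hypothetical stand-in for an LLM asked, once per document at ingest,
# "what questions would a user ask that this document answers?"
HYPOTHETICAL_QUESTIONS = {
    "kb-1": "how do I reset my password / change login password",
    "kb-2": "when do I get my bill / billing date",
}


def annotate(doc_id, text):
    # Append the generated questions so their vocabulary becomes searchable.
    return text + " " + HYPOTHETICAL_QUESTIONS.get(doc_id, "")


docs = {
    "kb-1": "Credentials can be rotated from the account settings page.",
    "kb-2": "Invoices are emailed on the first day of each month.",
}

plain = BM25Index(docs)
expanded = BM25Index({i: annotate(i, t) for i, t in docs.items()})

print(plain.search("reset password"))     # no lexical overlap with the raw docs
print(expanded.search("reset password"))  # kb-1 now matches via its annotation
```

The expansion cost is paid once per document at ingest; query time stays a pure term lookup, which is where the latency advantage comes from.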


jongjong 4 months ago

But embeddings are much cheaper to compute than LLM inference, and you only have to compute them once for any piece of content and can then reuse them many times.
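A minimal sketch of the compute-once, reuse-many-times point, assuming a cache keyed by a hash of the content. `EmbeddingCache` and `toy_embed` are hypothetical names, and `toy_embed` is only a placeholder for a real embedding model call.

```python
import hashlib


class EmbeddingCache:
    """Caches embeddings by content hash so each text is embedded once."""

    def __init__(self, embed_fn):
        self.embed_fn = embed_fn
        self.store = {}
        self.misses = 0  # number of times the model was actually called

    def get(self, text):
        key = hashlib.sha256(text.encode("utf-8")).hexdigest()
        if key not in self.store:
            self.misses += 1
            self.store[key] = self.embed_fn(text)
        return self.store[key]


def toy_embed(text):
    # Placeholder, NOT a real model: returns a tiny deterministic vector.
    return [len(text), sum(map(ord, text)) % 97]


cache = EmbeddingCache(toy_embed)
first = cache.get("hello world")
second = cache.get("hello world")  # served from the cache, no recompute
print(cache.misses)
```

Unchanged content hashes to the same key, so re-indexing or repeated queries against the same corpus never pay the embedding cost twice.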