Comment by throwaway81523

7 hours ago

Gad, they sure like to say "BM25" over and over again. That's a near-worthless approach to result ranking. Doing even a halfway OK job requires better-tuned and/or more powerful approaches.

throwaway7783 7 hours ago

Can you please elaborate on why?

cpursley 7 hours ago

It's common to do a hybrid of BM25 with other fuzzy search or pgvector.
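
For anyone wondering what such a hybrid might look like, here is a minimal sketch. It assumes reciprocal rank fusion (RRF) as the merge step, which is one common choice and not something the comment above specifies. The function name, the document IDs, and the two input rankings are hypothetical; in practice the lists would come from a BM25 query and a pgvector nearest-neighbor query.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of document IDs into one ranked list.

    Each document scores 1 / (k + rank) for every list it appears in.
    k=60 is a conventional default, assumed here rather than tuned.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties broken by document ID for determinism.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


# Hypothetical results from two independent retrievers.
bm25_hits = ["doc_a", "doc_b", "doc_c"]    # lexical (BM25) order
vector_hits = ["doc_c", "doc_a", "doc_d"]  # embedding-similarity order

fused = reciprocal_rank_fusion([bm25_hits, vector_hits])
print(fused)
```

RRF only looks at rank positions, so the BM25 scores and the vector distances never need to be put on a common scale, which is probably why it shows up so often in hybrid setups.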

storus 6 hours ago

BM25 is quite bad and needs to be retrained for each corpus anew. SPLADEv2 is much better and there are even better sparse embeddings these days.