Comment by seanhunter

4 days ago

That's not how llm training and recall works at all so I'm not surprised you are not getting good results in this way. You would be much better using a conventional search engine or if you want to use an llm, use one with a search tool so it will use the search engine for you.

The problem you're encountering is not the model being unable to determine whether a quote it knows is responsive to your prompt but instead is a problem to do with recall in the model (which is not generally a task it's trained for). So it's not a similarity problem it's a recall problem.

When LLMs are trained on a particular document, they don't save a perfect copy somehow that they can fish out later. They use it to update their weights via backpropogation and are evaluated on their "sentence completion" task during the main phase of training or on a prompt response eval set during instruction fine tuning. Unless your quote is in that set or is part of the eval for the sentence completion task during the main training, there's no reason to suppose the LLM will particularly be able to recall it as it's not being trained to do that.

So what happens instead is the results of training on your quote update the weights in the model and that maybe somehow in some way that is quite mysterious results in some ability to recall it later but it's not a task it's evaluated on or trained for, so it's not surprising it's not great at it and in fact it's a wonder it can do it at all.

p.s. If you want to evaluate whether it is struggling with similarity, look up a quote and ask a model whether or not it's responsive to a given question. I.e. give it a prompt like this

   I want a quote about someone living the highlife during  the 1960s.  Do you think this quote by George Best does the job? “I spent a lot of money on booze, birds, and fast cars. The rest I just squandered.”