Comment by gsuuon
4 days ago
I'm a little suspicious of the Isaac Newton example. The values of the better answer are very close, I wonder if the ordering holds up against small rewordings of the prompt?
Another approach if you're working with a local model is to ask for a summary of one word and then work with the resulting logits (wish I could find the article/paper that introduced this). You could compare similarity by just seeing how many shared words are in the top 500 of two queries, for example.
No comments yet
Contribute on Hacker News ↗