Comment by cranium
4 days ago
That's also why HyDE (Hypothetical Document Embeddings) can work better when the context isn't clear. Instead of embedding the user question directly – and risking retrieval of chunks that merely look like the question – you ask an LLM to hallucinate an answer and embed that to retrieve relevant chunks. Obviously, the hallucinated answer itself is never used afterwards.
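To make the flow concrete, here's a minimal sketch in Python. Everything in it is a stand-in: `embed` is a toy bag-of-words vector rather than a real embedding model, `fake_llm` is a hypothetical stub for the LLM call, and the two chunks are made up for illustration.

```python
import math
import re
from collections import Counter


def embed(text):
    """Toy stand-in for an embedding model: a bag-of-words count vector."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def fake_llm(question):
    """Hypothetical stub for the LLM call; the answer may well be wrong."""
    return ("Mitochondria produce ATP through cellular respiration, "
            "which supplies energy to the cell.")


def retrieve(search_text, chunks, k=1):
    """Return the k chunks most similar to search_text."""
    q = embed(search_text)
    return sorted(chunks, key=lambda c: cosine(q, embed(c)), reverse=True)[:k]


def hyde_retrieve(question, chunks, generate=fake_llm, k=1):
    """HyDE: embed a hallucinated answer instead of the question."""
    hypothetical = generate(question)  # used only as a search key, then dropped
    return retrieve(hypothetical, chunks, k)


chunks = [
    "What part of the cell makes energy? See the FAQ for more questions.",
    "Mitochondria generate ATP via cellular respiration, supplying energy to the cell.",
]
question = "What part of the cell makes energy?"

direct = retrieve(question, chunks)
hyde = hyde_retrieve(question, chunks)
print("direct:", direct[0])
print("hyde:  ", hyde[0])
```

With this toy data, the direct query lands on the chunk that just repeats the question, while the hallucinated answer lands on the chunk that actually contains the information. Only the retrieved chunks are passed on; the hypothetical answer is discarded.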
AFAIK, retrieving documents that look like the query is more commonly avoided by using a bi-encoder explicitly trained for retrieval. Those are generally trained to align the embeddings of queries with those of relevant documents, with each side getting a dedicated marker token, something like [QUERY] and [DOC], to make the distinction clear. HyDE's strong suit seems to be settings where the documents and queries you're working with are too niche to be properly understood by a generic retrieval model and you don't have enough concrete retrieval data to fine-tune a specialized one.
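For what the marker part looks like on the input side, here's a rough sketch. The `[QUERY]` and `[DOC]` strings are just the placeholders from above, and `prepare_input` is a hypothetical helper; real retrieval models each define their own prefix or token convention, so check the model card rather than copying these.

```python
# Placeholder markers; actual models use their own conventions.
MARKERS = {"query": "[QUERY]", "doc": "[DOC]"}


def prepare_input(text, role):
    """Prefix text with the marker for its role before it goes to the encoder."""
    if role not in MARKERS:
        raise ValueError(f"role must be one of {sorted(MARKERS)}, got {role!r}")
    return f"{MARKERS[role]} {text.strip()}"


# The same string becomes a different encoder input depending on its role.
as_query = prepare_input("how do mitochondria make energy", "query")
as_doc = prepare_input("how do mitochondria make energy", "doc")
print(as_query)
print(as_doc)
```

The marker only matters because the model was trained with it: the encoder learns to place a query near its relevant documents, not near text that merely resembles the query.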