
Comment by eoinbmorg

14 hours ago

Is RAG the right tool for this? My understanding was that RAG uses vector similarity to compare the query (the extracted string) against the search corpus (the PDF file). The use case you describe is verification, which sounds like it would be better done with an exhaustive search via string comparison instead of vector similarity.

I could be totally wrong here.

Some people define RAG as having to use vector search, others (myself included) define RAG as any technique that retrieves additional relevant context to help generate the response, which can include triggering things like full-text search queries or even grep (increasingly common thanks to Claude Code et al).

RAG is just "Retrieval Augmented Generation"; vector similarity is one way to do that retrieval, but not the only one. Though you are right, there is really no retrieval step augmenting the generation here, more like just a validation step stuck on the end.

Though I can imagine scenarios where the PDF is just an image (e.g. a scan of a form), in which case the validation would not work.
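
For what it's worth, here is a rough sketch of what a plain string-comparison validation step could look like, as opposed to vector similarity. The function names (`normalize`, `verify_extraction`) are made up for illustration, and it assumes you already have the PDF's text layer as a string from whatever extractor you use. A scanned PDF with no text layer would give an empty string, so the check reports failure rather than a false match.

```python
import re
import unicodedata


def normalize(text: str) -> str:
    """Fold Unicode forms, whitespace, and case so layout differences don't block a match."""
    text = unicodedata.normalize("NFKC", text)
    return re.sub(r"\s+", " ", text).strip().casefold()


def verify_extraction(extracted: str, document_text: str) -> bool:
    """Return True if the extracted string appears verbatim (after normalization)
    in the document text. Hypothetical helper, not from any library."""
    needle = normalize(extracted)
    haystack = normalize(document_text)
    # Empty haystack: likely an image-only PDF with no text layer to check against.
    if not needle or not haystack:
        return False
    return needle in haystack
```

Usage would be something like `verify_extraction(model_output_field, pdf_text)`. For the scanned-form case you would need an OCR pass first, and probably a fuzzier comparison since OCR output is noisy.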