Comment by marlott

5 months ago

Interesting - can you elaborate a little on what you mean by agentic search here?

Since the Claude Code docs suggest installing Ripgrep, my guess is that they mean that Claude Code often runs searches to find snippets to improve in the context.

I would argue that this is still RAG. There's a common misconception (or at least I think it's a misconception) that RAG only counts if you used vector search - I like to expand the definition of RAG to include non-vector search (like Ripgrep in this case), or any other technique where you use Retrieval techniques to Augment the Generation phase.

IR (Information Retrieval) has been around for many decades before vector search become fashionable: https://en.wikipedia.org/wiki/Information_retrieval

  • I agree that retrieval can take many forms besides vector search, but do we really want to call it RAG if the model is directing the search using a tool call? That like an important distinction to me and the name "agentic search" makes a lot more sense IMHO.

    • Yes, I think that's RAG. It's Retrieval Augmented Generation - you're retrieving content to augment the generation.

      Who cares if you used vector search for the retrieval?

      The best vector retrieval implementations are already switching to a hybrid between vector and FTS, because it turns out BM25 etc is still a better algorithm for a lot of use-cases.

      "Agentic search" makes much less sense to me because the term "agentic" is so incredibly vague.

      4 replies →

  • rag is an acronym with a pinned meaning now. just like the word drone. drone didnt really mean drone, but drone means drone now. no amount of complaining will fix it. :[

I guess it's what sometimes it's called "self RAG", that is, the agent looks inside the files how a human would be to find that's relevant.

  • As opposed to vector search, or…?

    • To my knowledge these are the options:

      1. RAG: A simple model looks at the question, pulls up some associated data into the context and hopes that it helps.

      2. Self-RAG: The model "intentionally"/agentically triggers a lookup for some topic. This can be via a traditional RAG or just string search, ie. grep.

      3. Full Context: Just jam everything in the context window. The model uses its attention mechanism to pick out the parts it needs. Best but most expensive of the three, especially with repeated queries.

      Aider uses kind of a hybrid of 2 and 3: you specify files that go in the context, but Aider also uses Tree-Sitter to get a map of the entire codebase, ie. function headers, class definitions etc., that is provided in full. On that basis, the model can then request additional files to be added to the context.

      3 replies →

    • Does it make sense to use vector search for code? It's more for vague texts. In the code relevant parts can be found by exact name match. (in most cases. both methods aren't exclusive)

      1 reply →