
Comment by pu_pe

23 days ago

> uses 98% fewer tokens than grep

So are we supposed to believe that grep is so wasteful that models are reading 98% useless garbage every time they call it? Either this claim is not representative, or you're missing something else when you throw away the vast majority of context for the model.

The 98% is vs the grep+read loop, not grep output alone. When an agent hits an unfamiliar codebase it typically does "cat file" or reads the whole thing first, at least in my experience. If you're reliably getting agents to do "grep -C N" and stop there I'd genuinely be curious what your setup looks like, because I think the quality of the results is just too low to serve as useful context.

  • > When an agent hits an unfamiliar codebase it typically does "cat file" or reads the whole thing first, at least in my experience.

    Depends on the size of the project and the specific files. I have definitely seen agents make smart use of pi's "read" tool, which can take an offset and line limit (or defaults to a max of 2000 lines/50KiB if the model doesn't specify). The bash tool has the same max output, so if a model decides to cat instead of using the read tool it still won't blow out its context window with a single large file read.

    But this sort of thing is going to vary with harness, model, project, and whatever the RNG delivers for the day.
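For anyone who wants to see what a capped read like that looks like, here is a minimal Python sketch. It is not pi's actual implementation: the function name `bounded_read`, the 0-based offset, and the choice to stop at the last whole line that fits are all assumptions. Only the 2000 line / 50KiB defaults come from the comment above.

```python
MAX_LINES = 2000        # default line cap mentioned above
MAX_BYTES = 50 * 1024   # 50 KiB cap mentioned above


def bounded_read(path, offset=0, limit=None):
    """Return up to `limit` lines starting at 0-based line `offset`.

    Hypothetical helper: the name, the 0-based offset, and the truncation
    behaviour are assumptions, not pi's real read tool.
    """
    limit = MAX_LINES if limit is None else min(limit, MAX_LINES)
    out, used = [], 0
    with open(path, "r", encoding="utf-8", errors="replace") as fh:
        for lineno, line in enumerate(fh):
            if lineno < offset:
                continue  # skip lines before the requested window
            size = len(line.encode("utf-8"))
            # Stop once either cap would be exceeded; never emit a partial line.
            if len(out) >= limit or used + size > MAX_BYTES:
                break
            out.append(line)
            used += size
    return "".join(out)
```

With caps like these, the worst case for a single read is bounded regardless of file size, which is why a stray `cat` is less damaging than it sounds.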

I had problems with Claude reading hundreds of kilobytes of outputs because grep found things in node_modules. (ripgrep helps, so it makes sense to add a line about it into some memory file.)
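A rough Python sketch of the effect, for illustration only: prune dependency directories before searching and cap the number of hits. This is not how ripgrep works internally (it reads ignore files such as .gitignore); the `SKIP_DIRS` set, the `search_tree` name, and the `max_matches` cap are assumptions made up for this example.

```python
import os
import re

SKIP_DIRS = {"node_modules", ".git"}  # assumed ignore list; adjust per project


def search_tree(root, pattern, max_matches=50):
    """Return a list of (path, line_number, line) hits, skipping SKIP_DIRS."""
    regex = re.compile(pattern)
    hits = []
    for dirpath, dirnames, filenames in os.walk(root):
        # Prune in place so os.walk never descends into skipped directories.
        dirnames[:] = sorted(d for d in dirnames if d not in SKIP_DIRS)
        for name in sorted(filenames):
            path = os.path.join(dirpath, name)
            try:
                with open(path, "r", encoding="utf-8", errors="replace") as fh:
                    for lineno, line in enumerate(fh, start=1):
                        if regex.search(line):
                            hits.append((path, lineno, line.rstrip("\n")))
                            if len(hits) >= max_matches:
                                return hits  # hard cap on output size
            except OSError:
                continue  # unreadable file: skip it rather than fail the search
    return hits
```

The cap matters as much as the pruning: even with node_modules excluded, a vague pattern can still match thousands of lines.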

Grep prints out every matching line. Some of the searches an LLM runs will return a lot of noise, and it may have to run them anyway because it doesn't yet know enough to be more specific. Targeted search can reduce the number of tokens.

I suspect this comparison is against reading the whole codebase, though, rather than against just getting the bits you need.
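To make the "whole file versus just the bits you need" distinction concrete, here is a small Python sketch of `grep -C N` style context extraction. The function name `context_windows` is made up for this example, and unlike real grep it does not print `--` separators between non-adjacent groups.

```python
import re


def context_windows(text, pattern, n=2):
    """Return the lines within `n` lines of any match, roughly like `grep -C n`."""
    lines = text.splitlines()
    regex = re.compile(pattern)
    keep = set()
    for i, line in enumerate(lines):
        if regex.search(line):
            # Keep the match plus n lines on either side, clamped to the file.
            keep.update(range(max(0, i - n), min(len(lines), i + n + 1)))
    return [lines[i] for i in sorted(keep)]
```

Comparing `len("\n".join(context_windows(text, pattern)))` with `len(text)` on your own files gives a quick feel for how much a full read costs relative to a context window. The ratio will depend heavily on the file and the pattern.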