Comment by Bibabomas
23 days ago
The 98% is vs the grep+read loop, not grep output alone. When an agent hits an unfamiliar codebase it typically does "cat file" or reads the whole thing first, at least in my experience. If you're reliably getting agents to do "grep -C N" and stop there I'd genuinely be curious what your setup looks like, because I think the quality of the results is just too low to serve as useful context.
> When an agent hits an unfamiliar codebase it typically does "cat file" or reads the whole thing first, at least in my experience.
Depends on the size of the project and the specific files. I have definitely seen agents make smart use of pi's "read" tool, which can take an offset and a line limit (or defaults to a max of 2000 lines/50KiB if the model doesn't specify). The bash tool has the same max output, so if a model decides to cat instead of using the read tool, it still won't blow out its context window with a single large file read.
But this sort of thing is going to vary with harness, model, project, and whatever the RNG delivers for the day.
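To make the capping behaviour concrete, here is a rough sketch of how a read tool like that could work. This is not pi's actual code: the function name, the 0-based offset, the "stop at whichever cap hits first" rule, and the return shape are all my assumptions. Only the 2000-line/50KiB defaults come from what I described above.

```python
# Hypothetical sketch of a capped "read" tool; not pi's real implementation.
MAX_LINES = 2000          # default line cap mentioned above
MAX_BYTES = 50 * 1024     # default byte cap mentioned above (50KiB)


def read_window(text, offset=0, limit=None,
                max_lines=MAX_LINES, max_bytes=MAX_BYTES):
    """Return (chunk, next_offset, truncated) for a slice of `text`.

    offset: 0-based index of the first line to return (assumed convention).
    limit:  number of lines requested; None means "use the default cap".
    """
    lines = text.splitlines(keepends=True)
    # Assumption: a requested limit can never exceed the hard line cap.
    want = max_lines if limit is None else min(limit, max_lines)

    out = []
    used = 0
    for line in lines[offset:offset + want]:
        size = len(line.encode("utf-8"))
        # Stop before the byte cap is exceeded, whichever cap hits first.
        if used + size > max_bytes:
            break
        out.append(line)
        used += size

    next_offset = offset + len(out)
    # truncated tells the caller there is more file left to page through.
    truncated = next_offset < len(lines)
    return "".join(out), next_offset, truncated


if __name__ == "__main__":
    big = "".join(f"line {i}\n" for i in range(5000))
    chunk, next_offset, truncated = read_window(big)
    print(len(chunk.splitlines()), next_offset, truncated)
```

The point is that the model can page through a big file by feeding next_offset back in as the offset, and a lazy "just read it all" call costs at most one capped chunk.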