Comment by luodaint
23 days ago
A metric that captures quality beyond simple token counts: correction-loop frequency.
When grep misses a relevant file, the agent does not fail; it keeps working with incomplete context. In a single-language codebase, the miss rate is tolerable. In a polyglot codebase (Python backend, TypeScript frontend), the problems emerge when querying for cross-file dependencies. Grep returns the route from the backend API, but there is also a TypeScript interface that the response has to match. The agent generates a response that does not fit the type. That costs one correction cycle, or two if the type conflict is ambiguous.
Combining grep with an understanding of the semantic relations between files is one solution. The token savings are real, but they understate the actual benefit, because fewer correction cycles are worth more than the tokens themselves.
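To make that concrete, here is a rough sketch of what I mean, not a real tool. Everything in it is made up for illustration: the file contents, the `RELATED` graph, and the function names `grep` and `grep_with_relations`. In practice the relation graph would have to come from some index of imports, API contracts, or schemas.

```python
import re

# Hypothetical in-memory repo: a Python route and the TypeScript
# interface its response is supposed to match.
FILES = {
    "backend/routes.py": '@app.get("/api/users")\ndef list_users():\n    return []\n',
    "frontend/types.ts": "export interface User {\n  id: number;\n  userName: string;\n}\n",
    "frontend/util.ts": "export const noop = () => {};\n",
}

# Hypothetical relation graph: file -> files semantically tied to it.
RELATED = {
    "backend/routes.py": ["frontend/types.ts"],
}


def grep(pattern, files):
    """Plain text search: return paths whose content matches the pattern."""
    rx = re.compile(pattern)
    return [path for path, text in files.items() if rx.search(text)]


def grep_with_relations(pattern, files, related):
    """Text search, then add files related to each hit (one hop only)."""
    hits = grep(pattern, files)
    context = list(hits)
    for path in hits:
        for neighbor in related.get(path, []):
            if neighbor not in context:
                context.append(neighbor)
    return context


plain = grep("/api/users", FILES)
expanded = grep_with_relations("/api/users", FILES, RELATED)
```

Plain grep only surfaces the backend route, since the TypeScript file never mentions the URL. The expanded search also pulls in `frontend/types.ts`, which is the file the agent needs to avoid the type mismatch in the first place.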