Comment by luodaint

23 days ago

One metric that captures quality beyond raw token count: correction loop frequency.
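To make the metric concrete, here is a minimal sketch that treats it as the average number of correction cycles per task. That definition is my assumption, and the `TaskRun` record, its fields, and the sample numbers are all hypothetical, not taken from any real tool.

```python
from dataclasses import dataclass


@dataclass
class TaskRun:
    # Hypothetical log record for one agent task.
    task_id: str
    tokens_used: int
    correction_cycles: int  # times the agent had to redo output after a failed check


def correction_loop_frequency(runs: list[TaskRun]) -> float:
    """Average correction cycles per task (0.0 when there are no runs)."""
    if not runs:
        return 0.0
    return sum(r.correction_cycles for r in runs) / len(runs)


# Made-up sample data, only to show the call.
runs = [
    TaskRun("a", tokens_used=1200, correction_cycles=0),
    TaskRun("b", tokens_used=900, correction_cycles=1),
    TaskRun("c", tokens_used=1500, correction_cycles=2),
]
print(correction_loop_frequency(runs))  # 1.0
```

Tracking this next to `tokens_used` would show whether a retrieval change that saves tokens also reduces rework.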

When grep misses a relevant file, the agent does not fail; it keeps working with incomplete context. In a single-language codebase, the miss rate is tolerable. In a polyglot codebase (say, a Python backend and a TypeScript frontend), the problems show up in queries about cross-file dependencies. Grep returns the backend API route, but there is also a TypeScript interface that the response has to match. The agent generates a response that does not fit the type. That is one correction cycle, or two if the type conflict is ambiguous.

One solution is to combine grep with an understanding of the semantic relations between files. The token savings are real, but they understate the actual benefit, because fewer correction cycles are worth more than the tokens themselves.
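A rough sketch of what I mean, assuming the relations are already available as a simple file-to-file map. The in-memory `FILES` repo, the `RELATED` map, and both function names are hypothetical; how you actually derive the relations (imports, route-to-type links, etc.) is the hard part and is not shown here.

```python
import re

# Hypothetical in-memory repo: path -> file contents.
FILES = {
    "backend/routes.py": '@app.get("/api/users")\ndef list_users(): ...\n',
    "frontend/types.ts": "export interface UserResponse { id: number; name: string }\n",
    "frontend/unrelated.ts": "export const theme = 'dark';\n",
}

# Hypothetical relation map: file -> files it is semantically tied to.
RELATED = {
    "backend/routes.py": ["frontend/types.ts"],
}


def grep(pattern: str, files: dict[str, str]) -> list[str]:
    """Plain text search: paths whose contents match the pattern."""
    rx = re.compile(pattern)
    return sorted(path for path, text in files.items() if rx.search(text))


def grep_with_relations(
    pattern: str, files: dict[str, str], related: dict[str, list[str]]
) -> list[str]:
    """Grep hits first, then any related files the text search missed."""
    hits = grep(pattern, files)
    context = list(hits)
    for path in hits:
        for neighbor in related.get(path, []):
            if neighbor in files and neighbor not in context:
                context.append(neighbor)
    return context


print(grep("/api/users", FILES))
print(grep_with_relations("/api/users", FILES, RELATED))
```

Plain grep only sees the Python route; the expanded version also pulls in the TypeScript interface, so the agent has the type in context before it writes the response.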