Comment by guccihat
5 months ago
> The Exercism problems have proven to be very effective at measuring an LLM's ability to modify existing code
The Aider Polyglot website also states that the benchmark "...asks the LLM to edit source files to complete 225 coding exercises".
However, looking at the actual tests [0], they don't seem to be about editing code bases, but rather just about solving simple programming exercises. What am I missing?