Comment by amluto
1 day ago
If I’m coordinating a large codebase, I expect the people I’m coordinating to be capable of learning and improving over time. Coding agents cannot (currently) do this.
I wonder if a very lightweight RL loop built around the user could work well enough to help the situation. As I understand it, though, current LLMs generally do not learn fast enough for a single bad RL example and one (prompted?) better example to produce improvement at anything near human speed.