Comment by simonw
5 hours ago
You don't run coding agents for a week and THEN compile their code. Even the best available models would have no chance of making that work: you're effectively asking them to one-shot a million lines of code without a single mistake.
You have the agents compile the code every single step of the way, which is what this project did.
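A rough sketch of what "compile every step of the way" can look like, purely as an illustration and not the project's actual harness. The names `compiles` and `run_steps` are hypothetical, and Python's built-in `compile()` stands in for whatever real build command an agent would run:

```python
from typing import Callable, Iterable, List, Tuple


def compiles(source: str) -> Tuple[bool, str]:
    """Stand-in for a real build step: check that `source` parses as Python."""
    try:
        compile(source, "<agent-edit>", "exec")
        return True, ""
    except SyntaxError as err:
        return False, f"{err.msg} (line {err.lineno})"


def run_steps(
    source: str, steps: Iterable[Callable[[str], str]]
) -> Tuple[str, List[str]]:
    """Apply each hypothetical edit step, keeping it only if the result compiles."""
    errors: List[str] = []
    for step in steps:
        candidate = step(source)
        ok, message = compiles(candidate)
        if ok:
            source = candidate      # accept the edit and build on it
        else:
            errors.append(message)  # a real loop would feed this back to the agent
    return source, errors
```

The point is that a broken edit is caught at the step that introduced it, so the error never compounds across later steps.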