Comment by BobbyJo
11 hours ago
Had this exact idea recently, applied to various software tooling. I think agents of all types are going to follow a similar path to self-driving cars: the first 80% comes in a big boom, and the last 20% comes over a decade of training and simulations.
I think each agent use case is going to need a simulation for its reward to eke out the last 20%.
Edit: Realized I forgot to say Great Work! Looks Cool!
Self-driving cars are a really good place to derive intuitions. Robotics as well!
Both of those spaces are still optimizing for last-mile performance gains, which get exponentially harder.
The good thing about computer use is that building software environments is faster and more repeatable, so hopefully we see quicker improvements here. :)