Comment by iandanforth
2 days ago
For some perspective, we have not yet scaled robot training. The amount of data that Pi is using to train their impressively capable robots is in the range of thousands of hours. In contrast language models are trained over trillions of tokens comprising the entirety of human knowledge. So if you're saying things like "this still seems hard", just remember we have yet to hit this with the data hammer. Simulation is proving a great way to augment / bootstrap robot dexterity, but it still pales in comparison to data in the real world. So, as the author points out, we may get capability scaling like Waymo, where one company painstakingly collects real data over a decade, but we may also see rapid progress in simulators and simulator speed overtake that approach for practical household / industrial tasks. My bet is on the latter.
> In contrast language models are trained over trillions of tokens comprising the entirety of human knowledge.
Not even close! At best it's a small subset of the internet + published books. The vast majority of human knowledge isn't even in the training sets yet.
I would question the use of a model fed everything, though.
Correct me if I'm wrong, but I haven't seen any simulator progress in years (e.g. MuJoCo hasn't changed in 5 years but is still SOTA in accuracy).
MuJoCo, Drake, Pinocchio (and other simulators) are still improving (adding more accurate collision detection, better solvers, etc.).
How do they compare to PyBullet?