Comment by hnuser123456

1 year ago

Well, in the case of the latter, there was a vaguely known glitch for driving on the nose that allowed for better speeds than possible on 4 wheels, but it would be completely uncontrollable to a human. He figured out how to break the problem down into steps that the NN could gradually learn piecewise, until he had cars racing around tracks while balancing on their nose.

It turned out to have learned to keep the car spinning on its nose for stability, and timing inputs to upset the spinning balance at the right moment to touch the ground with the tire to shoot off in a desired direction.

I think the overall lesson is that, to make useful machine learning, we must break our problems down into pieces small enough that an algorithm can truly "build up skills" and learn naturally, under the correct guidance.