Comment by hellohello2

3 hours ago

I like how you frame it as on/off policy learning.

A clearly poor way of studying with AI is to ask the AI to solve a problem and then try to follow its solution yourself.

A better way is to try to solve a problem yourself, ideally one that is slightly too hard for you, but not by much. When you hit a roadblock you can't get past, ask it for help.

This way you are basically importance sampling the information that you misunderstood but are capable of understanding.