Comment by nazgul17
2 hours ago
Is my understanding right that the system is limited by the capability of teachers to solve their own problems? The TrueSkill of teachers doesn't seem to increase all that much, but I suppose TrueSkill works like that if the whole population gets better.
The teachers never attempt to solve their own problems; only the students solve problems.
Regarding the TrueSkill of the teachers: the self-play settings we operate in in this paper are zero-sum and competitive, which means the skills of the two populations cannot both increase together. The objective of one population is adversarial to the other's: the teachers try to generate difficult tasks, while the students learn to solve them, which makes those difficult tasks easy.
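To make the zero-sum point concrete, here is a rough sketch. It uses a plain Elo update rather than TrueSkill, so it is a stand-in and not the rating system from the paper; the function name `elo_update`, the K-factor, and the starting ratings are all assumptions made up for illustration. The idea it shows is only that a relative rating stays roughly flat when both sides keep trading wins, however much each side may have improved in absolute terms.

```python
def elo_update(r_teacher, r_student, teacher_won, k=32.0):
    """Zero-sum Elo update (illustrative stand-in for TrueSkill).

    Whatever rating one side gains, the other side loses.
    """
    # Probability that the teacher "wins", i.e. the student fails the task.
    expected_teacher = 1.0 / (1.0 + 10 ** ((r_student - r_teacher) / 400.0))
    score = 1.0 if teacher_won else 0.0
    delta = k * (score - expected_teacher)
    return r_teacher + delta, r_student - delta


# Hypothetical starting ratings for one teacher and one student.
teacher, student = 1500.0, 1500.0

# Both sides may be getting better in absolute terms, but as long as they
# keep splitting outcomes evenly, neither rating drifts upward.
for game in range(100):
    teacher, student = elo_update(teacher, student, teacher_won=(game % 2 == 0))

total = teacher + student  # conserved by construction, up to float error
```

Because every update adds `delta` to one side and subtracts it from the other, the sum of the two ratings never changes, so a flat teacher rating says nothing by itself about whether the generated tasks got harder.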