Comment by CamperBob2
6 days ago
Certainly the models have orders of magnitude more data available to them than the smartest human being who ever lived ever had. So if the goal is "merely" superhuman intelligence, we can assume data is not the bottleneck.
It might be a constraint on the evolution of godlike intelligence, or AGI. But at that point we're so far out in bong-hit territory that it will be impossible to say who's right or wrong about what's coming.
Has learning through "self-play" (as with AlphaZero etc.) been demonstrated to improve LLMs?
My understanding (which might be incorrect) is that this amounts to RLHF without the HF part, i.e. reinforcement learning against automatically verifiable rewards rather than human preference labels, and is basically how DeepSeek-R1 was trained. I recall reading about OpenAI being butthurt^H^H^H^H^H^H^H^H concerned that their API might have been abused by the Chinese to train their own model.
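To make that concrete, the "verifiable reward" idea boils down to something like the sketch below. Names and tags here are my own invention, not DeepSeek's; R1's actual setup used GRPO with rule-based accuracy and format rewards, but the core point is the same:

    import re

    # Minimal sketch of a "verifiable reward": score a completion by
    # checking its final answer against ground truth. No human rater,
    # no learned preference model.
    def verifiable_reward(completion: str, expected: str) -> float:
        match = re.search(r"<answer>(.*?)</answer>", completion, re.DOTALL)
        if match is None:
            return 0.0  # malformed output earns nothing
        return 1.0 if match.group(1).strip() == expected.strip() else 0.0

    # In the RL loop you sample several completions per prompt, score
    # each with this reward, and update the policy toward the
    # higher-scoring ones.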
Superhuman capability within the tasks that are well represented in the dataset, yes. But if one takes the view that intelligence is the ability to solve novel problems (ref. F. Chollet), then the amount of data alone might not take us to superhuman intelligence, at least not without new breakthroughs in how the models or systems are constructed.
R1 managed to replicate a model on the level of the one they had access to, but as far as I know they did not improve on its predictive performance. They did improve on inference time, but that is a different thing. The ability to replicate a model is well demonstrated and has been common practice for some years already; see teacher-student distillation.
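The classic distillation recipe is simple: train the student to match the teacher's softened output distribution. A minimal sketch of the standard Hinton-style loss (generic textbook version, not DeepSeek's exact recipe):

    import torch
    import torch.nn.functional as F

    # Knowledge distillation loss: push the student's output
    # distribution toward the teacher's, softened by a temperature.
    def distillation_loss(student_logits: torch.Tensor,
                          teacher_logits: torch.Tensor,
                          temperature: float = 2.0) -> torch.Tensor:
        t = temperature
        teacher_probs = F.softmax(teacher_logits / t, dim=-1)
        student_log_probs = F.log_softmax(student_logits / t, dim=-1)
        # kl_div expects log-probs for the input and probs for the
        # target; the t^2 factor keeps gradient scale comparable
        # across temperatures.
        return F.kl_div(student_log_probs, teacher_probs,
                        reduction="batchmean") * (t * t)

Note that this form needs the teacher's logits. When the teacher is only reachable through an API, you don't get logits, so "distillation" reduces to fine-tuning the student on sampled teacher outputs, which is presumably what the OpenAI accusation was about.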