Comment by vessenes
8 hours ago
Wrong question. I suggest:
1) Do models generalize?
2) If they do, and they generalize from this, is that a win?
Chollet was one of the first “they do not generalize” evangelists. I’d be curious to hear what he thinks now, because a) most disagree with him, and b) this test seems designed to get models that can generalize better at visual, long-context problem solving and agency, which is exactly where the bleeding edge of agentic systems is right now.
Yeah, so you are agreeing that the benchmarks are useless because they don't answer those questions.
Can AI models generalize+ to any long-context problem solving and agency task, regardless of modality? I think the answer is no, and this is why they are not yet AGI.
+ generalize being the key word.