Comment by mvkel
17 hours ago
Was just at the YC launch event for this. Haven't felt this much inspiration in a while. Incredible minds converging on tech that will change our society.
I met a guy who, for fun, started working on ARC2, and as he got the number to go up in the eval, a novel way to more efficiently move a robotic arm emerged. All that to say: chasing evals per se can have tangible real world benefits.
Talking to the ARC folks tonight, it sounds like there will be an ARC-4,5,6,etc. I mean of course there will be.
But with them will be an increasing expectation that these models can eventually figure things out with zero context, and zero pretraining; you drop a brain into any problem and it'll figure out how to dig its way out.
That's really exciting.
>Talking to the ARC folks tonight, it sounds like there will be an ARC-4,5,6,etc. I mean of course there will be.
Quintessential goalpost moving...
If you read the charter of the eval (or any eval, really), this statement is pretty silly.
The whole point of each eval version is to identify a chunk of challenges that humans do well but AI can't. When AI gets to ~80%, you move to the next chunk. When you run out of challenges, you have AGI.
HN occasionally devolves into “supremely pedantic and nitpicky” mode. Today is one of those days.