Comment by levocardia

20 hours ago

My sense is that a powerful enough AI would have the sense to think something like "ah, this sounds like a video game! Let me code up an interactive GUI, test it for myself, then use it to solve these puzzles..." and essentially self-harness (the way you would if you were reading a geometry problem, by drawing it out on paper).

4 comments

pawelk411 18 hours ago

Yeah, but that's literally above ASI, let alone AGI. The average human scores <1% on this bench, while Opus scores 97.1% when given actual vision access, which means AGI was achieved long ago.

vova_hn2 13 hours ago
> Opus scores 97.1% when given actual vision access
Do you have a source for this? I would be very curious to see how top models do with vision.
- famouswaffles 8 hours ago
  
  https://news.ycombinator.com/item?id=47532483
- daveguy 10 hours ago
  
  No, there is no source for this. Opus is scoring around 1%, just like all the other frontier models. It would be fairly trivial to add a renderer intermediary, and if that improved the score to 97+%, you would get a huge cut of the $2 million. The assertion that Opus gets 97% if you just give it a GUI is completely bogus.