Comment by karmasimida

7 months ago

> if OpenAI ran this 10000 times in parallel and cherry-picked the best one

This is almost certainly the case. Remember the initial o3 ARC benchmark? I'd add that this is probably a multi-agent system as well, so the context length restriction can be bypassed.

Overall, AI being good at math problems isn't news to me. It was already better than 99.99% of humans; now it is better than 99.999% of us. So ... ?
