Comment by karmasimida
4 days ago
> if OpenAI ran this 10000 times in parallel and cherry-picked the best one
This is almost certainly the case. Remember the initial o3 ARC benchmark? I would add that this is probably a multi-agent system as well, so the context length restriction can be bypassed.
Overall, AI being good at math problems isn't news to me. It was already better than 99.99% of humans; now it is better than 99.999% of us. So ... ?