Comment by irthomasthomas
2 days ago
Depends. Something like ARC-AGI might be easy, as it follows a defined format. I would also guess that the usage pattern of someone running a benchmark would be quite distinct from that of a normal user, unless they take specific measures to blend in.