Comment by merelysounds

2 days ago

Would the model owners be able to identify the benchmarking session among many other similar requests?

It depends. Something like ARC-AGI might be easy to spot, since it follows a defined format. I would also guess that the usage pattern of someone running a benchmark is quite distinct from that of a normal user, unless they take specific measures to blend in.