Comment by merelysounds
2 days ago
Would the model owners be able to identify the benchmarking session among many other similar requests?
2 days ago
Depends. Something like ARC-AGI might be easy to spot, since it follows a defined format. I would also guess that the usage pattern of someone running a benchmark is quite distinct from that of a normal user, unless they take specific measures to blend in.