Comment by Bombthecat, 21 hours ago
Both of them look pretty old?

cjsaltlake, 21 hours ago
code clash I think would be quite hard to game or contaminate unintentionally; considering that models need to compete against one another

gertlabs, 21 hours ago
https://gertlabs.com already does this at scale. An industry-standard benchmark shouldn't be hosted or designed by a lab producing the models, regardless.

Bombthecat, 21 hours ago
I mean the data / benchmarks