Comment by hmokiguess 3 months ago
You should publish your evaluation set, that seems pretty interesting! What’s your favourite one?

iainmerrick 3 months ago
Why would you ask that? The whole point of making it private is to avoid it leaking into the training data.

hmokiguess 3 months ago
I thought open benchmarks helped, sorry, guess I was being naive.

iainmerrick 3 months ago
Ha, sorry, I was a bit brusque there. Open benchmarks do help, but they mostly help the vendors, not we the users!

Espressosaurus 3 months ago
Keeping tests private is the only way to keep them valid.