Comment by hmokiguess 1 month ago
You should publish your evaluation set, that seems pretty interesting! What’s your favourite one?

iainmerrick 1 month ago
Why would you ask that? The whole point of making it private is to avoid it leaking into the training data.

hmokiguess 1 month ago
I thought open benchmarks helped, sorry, guess I was being naive.

iainmerrick 1 month ago
Ha, sorry, I was a bit brusque there. Open benchmarks do help, but they mostly help the vendors, not we the users!

Espressosaurus 1 month ago
Keeping tests private is the only way to keep them valid.