Comment by TeMPOraL
1 month ago
Yeah, and if anything, RL has a reputation for being too good at this job, given all the cases where it gamed a benchmark by exploiting some environmental factor the supervisors hadn't thought of (numerical instabilities, rounding errors, bugs, etc.).
My favourite is this one:
https://news.ycombinator.com/item?id=43113941
The ML version of Professor Farnsworth[1]:
"It came to me in a dream, and I forgot it in another dream."
[1]: https://www.imdb.com/title/tt0584424/quotes/?item=qt0439248