Comment by Escapado
1 day ago
I agree with the sentiment but I wonder if a sufficiently large amount of sufficiently sophisticated benchmarks existed then I would be surprised if a model would only memorize those benchmarks while showing terrible real world performance. We are not there yet but maybe one day we will be.
No comments yet
Contribute on Hacker News ↗