Comment by sdfsefsdf

16 hours ago

Perhaps I've been deep in my own issues for too long, but it seems to me that the author is trying to say "don't trust the current evaluation suites too much": scores only reflect a small part of the problem. What's interesting is discovering a new, stable evaluation metric, building something new on top of it, and having that new thing yield some unexpected, intelligent results.

jxmorris12 37 minutes ago

This is certainly part of it! My point was that focusing on problems proposed by others is one very specific and pretty short-term mode of thinking. Good researchers improve benchmark scores. Great researchers think about what problem they're solving.