Comment by tanaros
3 hours ago
Whenever somebody makes a benchmark, people complain that the benchmark results are meaningless because they’re gamed. I don’t know why those same people don’t understand that grading on vibes is strictly worse.
3 hours ago
Whenever somebody makes a benchmark, people complain that the benchmark results are meaningless because they’re gamed. I don’t know why those same people don’t understand that grading on vibes is strictly worse.
Depends on benchmark.
If questions are fixed they are trivial to game.