Comment by andrepd

4 days ago

"Benchmarks" in AI are hilarious. These tools can't even solve problems that are moderately more difficult than something that has a GeeksforGeeks page, but according to these benchmarks they are all IOI gold medallists. What gives?

The benchmarks are created by humans. So are the training sets. It turns out the sorts of problems that humans like to benchmark with are also the sorts of problems humans like to discuss in the places those training sets were scraped from.

Well, that and the whole field is filled with AI hypemen who "contribute" by asking ChatGPT about the quality and validity of some other GPT response.