Slacker News Slacker News logo featuring a lazy sloth with a folded newspaper hat
  • top
  • new
  • show
  • ask
  • jobs
Library

Comment by Banditoz

19 hours ago

If the benchmarks are private, how do we reproduce the results? I looked up the Humanity's Last Exam (https://agi.safe.ai/) this model uses and I can't seem to access it.

1 comment

Banditoz

Reply

johndough  18 hours ago

You can request access here: https://huggingface.co/datasets/cais/hle

The test data is purposely difficult to access to reduce the chance of leaking it into the training dataset.

Slacker News

Product

  • API Reference
  • Hacker News RSS
  • Source on GitHub

Community

  • Support Ukraine
  • Equal Justice Initiative
  • GiveWell Charities