Comment by atleastoptimal

6 months ago

The problem is that no benchmark on a closed model can be private, even in theory: the model has to be called to run the benchmark, which exposes the contents to whoever owns the model from then on.

HN loves to speculate that OpenAI is some big scam whose seeming ascendance is based on deceptive marketing hype, but o1, to anyone who has tried it seriously, is undoubtedly well within the ballpark of what OpenAI claims it can do. If everything they are doing really is just overfitting and gaming the tests, that discrepancy will eventually catch up with them, and people will stop using the APIs and ChatGPT.