← Back to context

Comment by bwfan123

6 days ago

Did humans formalize the inputs ? or was the exact natural language input provided to the llm. A lot of detail is missing on the methodology used. Not to mention of any independent validation.

My skepticism stems from the past frontier math announcement which turned out to be a bluff.

People are reading a lot into the FrontierMath articles from a couple months ago, but tbh I don’t really understand what the controversy is supposed to be there. failing to clearly disclose sponsoring Epoch to make the benchmark clearly doesn’t affect performance of a model on it