Comment by sebastiennight

1 year ago

This is incorrect. If you take the most basic interpretation of an LLM at temperature 0 as predicting the most likely token, and you run it on, say, 1,000 runs of "complete this Spanish sentence with the word for 'X'", then:

- maybe ALL humans would fail the test in some way, e.g., let's say everybody gets at least 10 of those wrong, and the average person gets 100 of those wrong.

- still, as long as most people get each word right, your LLM would get every single response correct (because for each item in the test, 900+ people out of a thousand gave the same correct answer in the training set).

In that sense, it's totally possible for a system trained on a vast vat of average-human input to generate super-human outputs.
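
To make the arithmetic concrete, here is a minimal simulation sketch of that thought experiment. Everything in it is an assumption for illustration: errors are independent across people and items, each person gets an item wrong 10% of the time, wrong answers are spread over 3 hypothetical alternatives, and taking the most common answer per item stands in for "predict the most likely token at temperature 0". It is not a claim about how any real LLM is trained.

```python
import random
from collections import Counter

def simulate(n_people=1000, n_items=1000, error_rate=0.1,
             n_wrong_options=3, seed=0):
    """Compare the best individual score with a per-item plurality vote.

    All parameters are illustrative assumptions, not measured values.
    """
    rng = random.Random(seed)
    CORRECT = 0  # answer 0 is the right word; 1..n_wrong_options are mistakes

    # answers[p][i] is what person p wrote for item i.
    answers = [
        [CORRECT if rng.random() >= error_rate
         else rng.randint(1, n_wrong_options)
         for _ in range(n_items)]
        for _ in range(n_people)
    ]

    # Score each person on their own.
    individual_scores = [row.count(CORRECT) for row in answers]

    # "Most likely answer" per item: the plurality vote across all people.
    majority_score = 0
    for item in range(n_items):
        votes = Counter(answers[p][item] for p in range(n_people))
        most_common_answer, _ = votes.most_common(1)[0]
        if most_common_answer == CORRECT:
            majority_score += 1

    return max(individual_scores), majority_score

best_individual, majority = simulate()
print("best individual score:", best_individual)
print("plurality-vote score: ", majority)
```

Under these assumptions nobody gets a perfect score, while the plurality vote gets every item right. The independence assumption is doing the work: if most people made the same mistake on the same item, the vote would reproduce that mistake.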

sambapa 1 year ago

But still, the questions in that test are "solved" in the sense of "I can take a dictionary and answer these questions with full certainty". Beyond established knowledge, LLMs are monkeys with typewriters, at best.

yunwal 1 year ago
I’d like to see you ace even a middle-school level Spanish test with just a dictionary (sub Spanish with some other language if you happen to know Spanish).
- sambapa 1 year ago
  
  It was a figure of speech. But there is nothing superintelligent about acing Spanish tests. Give me a Riemann hypothesis.
  