Comment by sambapa

1 year ago

I always wonder how LLMs will achieve superintelligence when they are, by definition, average.

This is incorrect. Take the most basic interpretation of an LLM at temperature 0, as predicting the most likely token, and give it, say, a 1,000-item test of the form "complete this Spanish sentence with the word for 'X'". Then:

- maybe ALL humans would fail the test in some way, e.g. let's say everybody gets at least 10 of those wrong, and the average person gets 100 of those wrong.

- still, as long as most people get each individual word right, your LLM would get every single response correct (because for each item in the test, 900+ people out of a thousand gave the same correct answer in the training set).

In that sense, it's totally possible for a system trained on a vast vat of average-human input to generate super-human outputs.
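To make the argument concrete, here is a rough simulation of it, not a model of any real LLM. Everything in it is an assumption for illustration: the `simulate` function, the 10% per-item error rate, the three wrong options per item, and above all the idea that people's mistakes are independent of each other. Picking the most common answer per item stands in for "most likely token at temperature 0".

```python
import random
from collections import Counter


def simulate(n_people=1000, n_items=1000, error_rate=0.1,
             n_wrong_options=3, seed=0):
    """Compare individual test-takers with a per-item majority vote.

    Hypothetical setup: answer 0 is always the correct one, and answers
    1..n_wrong_options are distinct wrong answers. Each person errs on
    each item independently with probability error_rate.
    """
    rng = random.Random(seed)
    correct = 0

    # answers[p][i] is the answer person p gave to item i.
    answers = [
        [correct if rng.random() >= error_rate
         else rng.randint(1, n_wrong_options)
         for _ in range(n_items)]
        for _ in range(n_people)
    ]

    # How many items each individual got wrong.
    errors_per_person = [sum(a != correct for a in row) for row in answers]

    # Stand-in for temperature 0: take the single most common answer
    # per item and count how often that answer is wrong.
    majority_errors = 0
    for i in range(n_items):
        votes = Counter(row[i] for row in answers)
        top_answer, _ = votes.most_common(1)[0]
        if top_answer != correct:
            majority_errors += 1

    best_person_errors = min(errors_per_person)
    mean_person_errors = sum(errors_per_person) / n_people
    return best_person_errors, mean_person_errors, majority_errors


if __name__ == "__main__":
    best, mean, majority = simulate()
    print(f"best individual got {best} wrong")
    print(f"average individual got {mean:.1f} wrong")
    print(f"majority vote got {majority} wrong")
```

The caveat is the independence assumption: if most people make the same mistake on some item, the majority answer is wrong on that item too, so the effect only covers errors that are not widely shared.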

  • But still, the questions in that test are "solved" in the sense of "I can take a dictionary and answer these questions with full certainty". Beyond established knowledge, LLMs are monkeys with typewriters, at best.

    • I’d like to see you ace even a middle-school-level Spanish test with just a dictionary (substitute some other language for Spanish if you happen to know Spanish).

To this pedantic point: that holds if the average written intelligence of all humans, alive and dead, is greater than the max intelligence of all living humans who are also willing and positioned to do the same task at the same time and place.

But yeah, I don't think LLMs (the current core architecture) can provide superintelligence. Architecturally speaking, I think it needs a bit more than next-token prediction.