Comment by sambapa

1 year ago

I always wonder how LLMs will achieve superintelligence when they are, by definition, average.

8 comments

sambapa

This is incorrect. Take the most basic interpretation of an LLM at temperature 0 as predicting the most likely token, and test it on, say, 1,000 items of the form "complete this Spanish sentence with the word for 'X'". Then:

- Maybe ALL humans would fail the test in some way, e.g. let's say everybody gets at least 10 of those wrong, and the average person gets 100 of those wrong.

- Still, as long as most people get each individual word right, your LLM would get every single response correct (because for each item in the test, 900+ people out of a thousand gave the same correct answer in the training set).

In that sense, it's totally possible for a system trained on a vast vat of average-human input to generate super-human outputs.
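To make the argument concrete, here is a rough sketch of that thought experiment. It is only a toy: it assumes 1,000 simulated people who each get any given item wrong independently with probability 0.1 (so about 100 wrong on average), and it stands in for "LLM at temperature 0" with a plain majority vote over their answers. The `simulate` function, its parameters, and the "wrong-N" answer labels are all made up for illustration; a real LLM is not literally a majority vote.

```python
import random
from collections import Counter


def simulate(num_people=1000, num_items=1000, error_rate=0.1, seed=0):
    """Toy model: each person answers each item, erring independently.

    Returns (wrong answers per person, items the majority vote gets wrong).
    All numbers here are illustrative assumptions, not measurements.
    """
    rng = random.Random(seed)
    correct = "right"
    human_wrong = [0] * num_people
    votes = [Counter() for _ in range(num_items)]

    for person in range(num_people):
        for item in range(num_items):
            if rng.random() < error_rate:
                # Mistakes are spread over a few different wrong answers.
                answer = f"wrong-{rng.randrange(5)}"
                human_wrong[person] += 1
            else:
                answer = correct
            votes[item][answer] += 1

    # "Temperature 0" stand-in: pick the most common answer for each item.
    model_wrong = sum(
        1 for tally in votes if tally.most_common(1)[0][0] != correct
    )
    return human_wrong, model_wrong


if __name__ == "__main__":
    human_wrong, model_wrong = simulate()
    print("best human, items wrong:   ", min(human_wrong))
    print("average human, items wrong:", sum(human_wrong) / len(human_wrong))
    print("majority vote, items wrong:", model_wrong)
```

Under these assumptions no single person gets a perfect score, yet the per-item majority does, because the individual mistakes do not line up on the same items. If errors were strongly correlated (everyone wrong on the same items), the majority vote would inherit them.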

sambapa 1 year ago
But still, the questions in that test are "solved" in the sense of "I can take a dictionary and answer these questions with full certainty". Beyond established knowledge, LLMs are monkeys with typewriters, at best.
- yunwal 1 year ago
  
  I’d like to see you ace even a middle-school-level Spanish test with just a dictionary (substitute some other language for Spanish if you happen to know Spanish).
  

drpossum 1 year ago

Where is this defined? I'll wait for your response.

sambapa 1 year ago

In some math books about Markov chains.

porridgeraisin 1 year ago

To this pedantic point: "average" could still be enough if the average written intelligence of all humans, alive and dead, is greater than the max intelligence of all living humans who are also willing and positioned to do the same task at the same time and place.

But yeah, I don't think LLMs (the current core architecture) can provide superintelligence. I think it needs a bit more than next-token prediction, architecturally speaking.