This is incorrect. If you take the most basic interpretation of an LLM at temperature 0 as predicting the most likely token, and you give it a test of, say, 1,000 items of the form "complete this Spanish sentence with the word for 'X'", then:
- maybe ALL humans would fail the test in some way, e.g. let's say everybody gets at least 10 of those wrong, and the average person gets 100 of those wrong.
- still, as long as most people get each individual word right, your LLM would get every single response correct (because for each item in the test, a clear majority, on average about 900 people out of a thousand, gave the same correct answer in the training set).
In that sense, it's totally possible for a system trained on a vast vat of average-human input to generate super-human outputs.
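To make the majority-vote argument concrete, here's a toy simulation. It's only a sketch under assumptions that aren't in the comment above: every person answers every item independently with the same accuracy, wrong answers are spread over a few wrong options, and "temperature 0" is modeled as simply picking the most common answer per item. The function name `simulate` and all its parameters are made up for illustration; a real LLM is not literally a per-question vote counter.

```python
import random
from collections import Counter


def simulate(num_people=1000, num_items=1000, accuracy=0.9,
             num_wrong_options=4, seed=0):
    """Toy model: answer 0 is correct, answers 1..num_wrong_options are wrong.

    Returns (individual_scores, majority_score), where majority_score is the
    number of items for which the most common answer is the correct one.
    """
    rng = random.Random(seed)
    individual_scores = [0] * num_people
    votes = [Counter() for _ in range(num_items)]

    for person in range(num_people):
        for item in range(num_items):
            # Assumption: each answer is an independent coin flip.
            if rng.random() < accuracy:
                answer = 0
            else:
                answer = rng.randint(1, num_wrong_options)
            votes[item][answer] += 1
            if answer == 0:
                individual_scores[person] += 1

    # "Temperature 0" stand-in: take the single most common answer per item.
    majority_score = sum(
        1 for counter in votes if counter.most_common(1)[0][0] == 0
    )
    return individual_scores, majority_score


if __name__ == "__main__":
    scores, majority = simulate()
    print("best individual score:", max(scores))
    print("average individual score:", sum(scores) / len(scores))
    print("majority-vote score:", majority)
```

With these settings no individual should come close to a perfect score, while the per-item majority should get essentially everything right. The effect depends on errors being uncorrelated: if most people make the same mistake on an item, the majority answer is wrong too.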
But still, the questions in that test are "solved" in the sense of "I can take a dictionary and answer these questions with full certainty". Beyond established knowledge, LLMs are monkeys with typewriters, at best.
I’d like to see you ace even a middle-school level Spanish test with just a dictionary (sub Spanish with some other language if you happen to know Spanish).
To this pedantic point: that holds if the average written intelligence of all humans, alive and dead, exceeds the max intelligence of all living humans who are also willing and positioned to do the same task at the same time and place.
But yeah, I don't think LLMs (the current core architecture) can provide superintelligence. Architecturally speaking, I think it needs a bit more than next-token prediction.
Where is this defined? I'll wait for your response.
In some math books about Markov chains.