Comment by dragonwriter
1 year ago
> There are problems that are easy for human beings but hard for current LLMs (and maybe impossible for them; no one knows). Examples include playing Wordle and predicting cellular automata (including Turing-complete ones like Rule 110). We don’t fully understand why current LLMs are bad at these tasks.
I thought we did know for things like playing Wordle: it's because they deal with words as sequences of tokens that correspond to whole words rather than sequences of letters, so a game built around sequences of letters constrained to valid words doesn't match the way they process information?
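As a concrete illustration (a minimal sketch assuming the tiktoken library; any BPE tokenizer makes the same point), you can look at what the model actually receives for a word:

  import tiktoken  # assumption: using OpenAI's tiktoken library as an example tokenizer

  enc = tiktoken.get_encoding("cl100k_base")
  for word in ["crane", "slate", "wordle"]:
      ids = enc.encode(word)
      pieces = [enc.decode([i]) for i in ids]
      # Common words usually come out as a single id or a couple of
      # multi-letter chunks; the individual letters never appear.
      print(word, ids, pieces)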
> Providing an LLM with examples and step-by-step instructions in a prompt means the user is figuring out the “reasoning steps” and handing them to the LLM, instead of the LLM figuring them out by itself. We have “reasoning machines” that are intelligent but seem to be hitting fundamental limits we don’t understand.
But providing examples with different, contextually appropriate sets of reasoning steps can enable the model to choose its own, more-or-less appropriate set of reasoning steps for particular questions that don't match the examples.
> It’s unclear if better prompting and bigger models using existing attention mechanisms can achieve AGI.
Since there is no objective definition of AGI or test for it, there’s no basis for any meaningful speculation on what can or cannot achieve it; discussions about it are quasi-religious, not scientific.
Arriving at a generally accepted scientific definition of AGI might be difficult, but a more achievable goal might be to arrive at a scientific way to determine that something is not AGI. And while I'm not an expert in the field, I would certainly think a strong contender for a relevant criterion would be an inability to process information in any way other than the one a system was explicitly programmed for, even if the new way of processing information is closely related to the pre-existing method. Most humans playing Wordle for the first time probably weren't used to thinking about words that way either, but they were able to adapt because they actually understand how letters and words work.
I'm sure one could train an LLM to be awesome at Wordle, but from an AGI perspective the fact that you'd have to do so proves it's not a path to AGI. The Wordle-dominating LLM would presumably be perplexed by the next clever word game until trained on thinking about information that way, while a human doesn't need to absorb billions of examples to figure it out.
I was originally pretty bullish on LLMs, but now I'm equally convinced that while they probably have some interesting applications, they're a dead-end from a legitimate AGI perspective.
An LLM doesn't even see individual letters at all, because they get encoded into tokens before they are passed as input to the model. It doesn't make much sense to make reasoning about things that aren't even in the input a requisite for intelligence.
That would be like an alien race that could see in an extra dimension, or see the non-visible light spectrum, presenting us with problems that we cannot even see and saying that we don't have AGI when we fail to solve them.
And yet ChatGPT 3.5 can tell me the nth letter of an arbitrary word…
"they're a dead-end from a legitimate AGI perspective"
Or another piece of the puzzle to achieve it. There might not be one true path, but rather a clever combination of existing working pieces, where (different) LLMs are one or some of those pieces.
I believe there is not just one way of thinking in the human brain either; my thought processes happen on different levels and are maybe based on different mechanisms. But as far as I know, we lack the details.
What about an LLM that can't play wordle itself without being trained on it, but can write and use a wordle solver upon seeing the wordle rules?
I think "can recognize what tools are needed to solve a problem, build those tools, and use those tools" would count as a "path to AGI".
LLMs can’t reason but neither can the part of your brain that automatically completes the phrase “the sky is…”
"Since there is no objective definition of AGI or test for it, there’s no basis for any meaningful speculation on what can or cannot achieve it; discussions about it are quasi-religious, not scientific."
This is such a weird thing to say. Essentially _all_ scientific ideas are, at least to begin with, poorly defined. In fact, I'd argue that almost all scientific ideas remain poorly defined with the possible exception of _some_ of the basic concepts in physics. Scientific progress cannot be and is not predicated upon perfect definitions. For some reason when the topic of consciousness or AGI comes up around here, everyone commits a sort of "all or nothing" logical fallacy: absence of perfect knowledge is cast as total ignorance.
Yes. That absence of a perfect definition was part of why Turing came up with his famous test so long ago. His original paper is a great read!
What is the rough definition, then?
Sam Harris argues similarly in The Moral Landscape. There's this conception that objective morality cannot exist outside of religion, because as soon as you try to establish one, philosophers rush in with pedantic criticism that, applied consistently, would render any domain of science invalid.
I kinda get where Sam Harris is coming from, but it's kind of silly to call what he is talking about morality. As far as I can tell, Harris is just a moral skeptic who believes something like "we should get a bunch of people together to decide roughly what we want in the world and then rationally pursue those ends." But that is very different from morality as it was traditionally understood (e.g., facts about behaviors which are objective in their assignment of good and bad).
I think one should feel comfortable arguing that AGI must at least be stateful and experience continuous time. Such that a plain old LLM is definitively never going to be AGI, but an LLM called in a while-true loop might.
I don't understand why you believe it must experience continuous time. If you had a system which clearly could reason, which could learn new tasks on its own, which didn't hallucinate any more than humans do, but it was only active for the period required for it to complete an assigned task, and was completely dormant otherwise, why would that dormant period disqualify it as AGI? I agree that such a system should probably not be considered conscious, but I think it's an open question whether or not consciousness is required for intelligence.
Active for a period is still continuous during that period.
As opposed to “active when called”. A function, being called repeatedly over a length of time, is reasonably “continuous” imo.
I think it's noteworthy that humans actually fail this test... We have to go dormant for 8 hours every day.
A consistent stateful experience may be needed, but not sure about continuous time. I mean human consciousness doesn't do that.
Human consciousness does though, e.g. the flow state. F1 drivers are a good example.
We tend to not experience continuous time because we repeatedly get distracted by our thoughts, but entering the continuous stream of now is possible with practice and is one of the aims of many meditators.
I would argue it needs to be at least somewhat continuous. Perhaps discrete at some granularity, but if something is just a function waiting to be called, it's not an intelligent entity. The entity is the calling itself.
I try my best not to experience continuous time for at least eight hours a day.
Then for at least eight hours a day you don’t qualify as a generally intelligent system.
Some good prompt-reply interactions are probably fed back into subsequent training runs, so they're still stateful/have memory in a way; there's just a long delay.
That’s not the AGI’s state. That’s just some past information.
You could imagine an LLM being called in a loop with a prompt like
  You observe: {new input}
  You remember: {from previous output}
  React to this in the following format:
  My inner thoughts: [what do you think about the current state]
  I want to remember: [information that is important for your future actions]
  Things I do: [Actions you want to take]
  Things I say: [What I want to say to the user]
  ...
Not sure if that would qualify as an AGI as we currently define it. Given a sufficiently good LLM with strong reasoning capabilities, such a setup might be able to do many of the things we currently expect AGIs to be able to do, including planning and learning new knowledge and new skills (by collecting and storing positive and negative examples in its "memory"). But its learning would be limited, and I'm sure as soon as it exists we would agree that it's not AGI.
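A minimal sketch of that loop, with call_llm standing in for whatever chat-completion API you'd actually use (the memory handling is deliberately naive):

  def call_llm(prompt: str) -> str:
      raise NotImplementedError  # placeholder for a real chat-completion call

  memory = ""
  while True:
      observation = input("You observe: ")
      prompt = (
          f"You observe: {observation}\n"
          f"You remember: {memory}\n"
          "React to this in the following format:\n"
          "My inner thoughts: ...\n"
          "I want to remember: ...\n"
          "Things I do: ...\n"
          "Things I say: ...\n"
      )
      reply = call_llm(prompt)
      # Carry forward whatever the model asked to remember as next turn's state.
      for line in reply.splitlines():
          if line.startswith("I want to remember:"):
              memory = line[len("I want to remember:"):].strip()
      print(reply)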
This already exists (in a slightly different prompt format); it's the underlying idea behind ReAct: https://react-lm.github.io
As you say, I'm skeptical this counts as AGI. Although I admit that I don't have a particularly rock solid definition of what _would_ constitute true AGI.
(Author here.) I tried creating something similar in order to solve Wordle etc., and the interesting part is that it is still insufficient. That's part of the mystery.
It works better to give it access to functions to call for taking actions and remembering things, but this approach does provide some interesting results.
Regarding Wordle, it should be straightforward to make a token-based version of it, and I would assume that that has been tried. It seems the obvious thing to do when one is interested in the reasoning abilities necessary for Wordle.
That doesn't seem straightforward - although it's blind to letters because all it sees are tokens, it doesn't have much training data ABOUT tokens.
What the parent is saying is that instead of asking the LLM to play a game of Wordle with tokens like TIME, LIME, we ask it to play with tokens like T, I, M, E, L. This is easy to do.
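For instance, a quick sketch of that presentation (space-separating the letters so a BPE tokenizer will almost certainly emit one token per letter; G/Y/X for green/yellow/gray is just the convention chosen here):

  def spell_out(word):
      return " ".join(word.upper())

  def render_turn(guess, feedback):
      # feedback: one of G/Y/X per position (green/yellow/gray)
      return f"guess:  {spell_out(guess)}\nresult: {' '.join(feedback.upper())}"

  print(render_turn("lime", "xgxy"))
  # guess:  L I M E
  # result: X G X Y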