Comment by jerf

3 years ago

Which is why I disclaimed it. I hate it when people quote things, cut off the quote, then bitch about the part they cut off.

No, the models will not "predict Einstein". They'll predict the most popular interpretation of him at best, and while that is also a simplification, ChatGPT is not sitting on top of the solution to the Grand Unified Theory. It may give a good overview of the consensus, but it will not be able to tell you the correct solution to the problem right now... though it won't be hard to convince it to swear up and down that it has.

For ‘predict Einstein’ read ‘predict what an Einstein-level intellect would think (about a given subject)’, not literally replicate what Einstein did.

You’re focused on the idea of LLMs as collators of ‘things that have been said’, but that’s not all that they collate from their training set.

  • Yes, I'm aware of that too. Maybe I just don't feel like spewing a complete description of LLMs into my every post about them.

    To the extent that they may come up with novel ideas, they have no ability to compare them against the true state of the world. This is not exactly a limitation of them per se that could be overcome with more computation, so much as just a structural fact about them; they have no loop where they can form a hypothesis, test it, and adjust based on data. It simply doesn't exist.

    Which is part of why I keep saying that while I'm less impressed than everyone else is with LLMs, the future AIs that will incorporate them but not simply be an LLM are going to really knock people's socks off. Pretty much all the things people are trying to convince LLMs to do that they really can't do are going to work in that generation. I have no idea if that generation is six months or six years away, but I wouldn't bet on much more than a few years.

  • >> but that’s not all that they collate from their training set

    OK, so what else is it?

    • LLMs learn to make predictions. They don't learn to imitate. They don't learn to simulate. There's nothing about learning to predict that makes the intelligence you gain constrained to the data you're learning from. The opposite if anything. But that's another argument for another time.

      The point I was making is that LLMs don't come out of training as the average of what they've learned. They can make predictions on any state in their training set.

      They can make predictions about the most intelligent state and the dumbest state. The most emotional state and the least emotional state. It's this powerful prediction range that makes them capable of imitating or simulating damn near anything in the training set to high and ever-increasing accuracy.