Comment by logicprog
1 year ago
Well said. This has always been my fundamental problem with the claims about large language models' current or eventual capabilities: most of the things people claim they can or will be able to do require a neural architecture completely different from the one they have. No amount of scaling up the number of neurons or the amount of training data will change that fundamental architecture, and at a very basic level the capabilities of any neural network are going to be limited by its architecture. We would need to add some kind of advanced recursive structure to large language models, as well as some kind of short-term and working memory, and probably many other structures, to make them capable of the kind of metacognition necessary to properly do a lot of the things people want them to be able to do. By metacognition I mean the ability to analyze what one is currently thinking and think new things based on that analysis, and therefore to look at what one is thinking and error-correct it, consciously adjust or iterate on it, or consciously ensure that one is adhering to certain principles of reasoning or knowledge. Without it, we can't expect large language models to actually understand concepts and principles and how they apply, to reliably perform reasoning, or even to obey instructions.