Comment by yosito
1 year ago
> If there exist classes of problems that someone in an elementary school can easily solve but a trillion-token billion-dollar sophisticated model cannot solve, what does that tell us about the nature of our cognition?
I think what it tells us is that our cognition is capable of more than just language modeling. With LLMs we are discovering both the (amazing) capabilities and the limits of language models. While language models can do incredible things with language that humans can't, they still can't do something as simple as sudoku. But there are other neural networks, such as CNNs and RNNs, that can solve sudoku better than humans can. I think the lesson here is that some problems are in the domain of language models, and some problems are a better fit for other forms of cognition. The human brain is amazing in that it combines several forms of cognition in an integrated way.
One thing I think LLMs have the capability to do is to integrate several types of systems and choose the right one for a given problem. Teach an LLM how to interface with a CNN that solves sudoku puzzles, and then ask it a sudoku problem.
It seems to me that if we want to create an AGI, we need to learn how to integrate several different types of models and teach the combined system to route each task we give it to the model best suited to it.
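To make that routing idea concrete, here is a minimal Python sketch. Every name in it (call_llm, cnn_sudoku_solver, the SPECIALISTS table) is a hypothetical placeholder rather than a real API; the point is only the shape of the design, where the LLM classifies the task and a specialist model produces the answer.

```python
# Hypothetical sketch of LLM-as-router. The stubs below stand in for
# real models; only the dispatch structure is the point.
def call_llm(prompt: str) -> str:
    """Placeholder for a real LLM call (e.g. via some API client)."""
    return "sudoku" if "sudoku" in prompt.lower() else "prose"

def cnn_sudoku_solver(puzzle: str) -> str:
    """Placeholder for a trained CNN (or any specialist solver)."""
    return "<solved grid>"

SPECIALISTS = {
    "sudoku": cnn_sudoku_solver,
    "prose": lambda task: call_llm(task),
}

def dispatch(task: str) -> str:
    # The LLM only picks the specialist; the specialist does the work.
    label = call_llm(f"Classify this task as one of {list(SPECIALISTS)}: {task}")
    return SPECIALISTS.get(label, SPECIALISTS["prose"])(task)

print(dispatch("Solve this sudoku: ..."))  # routes to the CNN stub
```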
What about sudoku makes it a good fit for CNNs? Or do you mean the machine vision for converting the pixels into an awareness of the sudoku puzzle's initial conditions?
A relatively simple graph theory algorithm can solve it (with multiple orders of magnitude fewer calculations). Even a naive brute-force search is tractable, given the problem size. Although search could arguably count as one of the AI tools in your proposed toolbox.
But even without going that far (integrating various other specialized models, or having an LLM use them when required), an LLM can probably recognize a sudoku puzzle when it sees one, and even though it can't solve the puzzle itself, I think it can easily write code that would (see the sketch below). So instead of hooking it up to a set of pre-built models, it might be enough to hook it up to a Python interpreter.
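And the code in question really is short. Here is a minimal backtracking solver of the kind an LLM could plausibly produce (an illustrative sketch, not taken from any model's output):

```python
# Minimal backtracking sudoku solver. `grid` is a 9x9 list of lists,
# with 0 marking an empty cell. Solves in place; returns True on success.
def valid(grid, r, c, d):
    # Digit d must not already appear in row r, column c, or the 3x3 box.
    if any(grid[r][j] == d for j in range(9)):
        return False
    if any(grid[i][c] == d for i in range(9)):
        return False
    br, bc = 3 * (r // 3), 3 * (c // 3)
    return all(grid[br + i][bc + j] != d
               for i in range(3) for j in range(3))

def solve(grid):
    for r in range(9):
        for c in range(9):
            if grid[r][c] == 0:
                for d in range(1, 10):
                    if valid(grid, r, c, d):
                        grid[r][c] = d
                        if solve(grid):
                            return True
                        grid[r][c] = 0  # undo and backtrack
                return False  # no digit fits this cell
    return True  # no empty cells left: solved
```

Typical puzzles solve near-instantly with this, which is the tractability point made in the comment above.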
Many LLMs are already linked to Python interpreters, but they still need some improvement at recognizing when a problem calls for writing code.
It can spit out some rehash of a sudoku solver it had in its training data. LLMs are terrible at coding.
What do you mean by "choose the right one to solve a problem"? That phrase seems to do a lot of heavy lifting in your take. My understanding is that an LLM has no capability to choose anything. It's predicting some tokens based on its training data and your prompt.
Let's try...
Prompt: Predict which type of algorithm would be effective to solve sudoku.
Response: A backtracking algorithm is typically best for solving Sudoku puzzles due to its efficiency in exploring all possible number placements systematically until it finds the correct solution.
...seemed to work well enough for me.
Prompt 2: Which type of neural network is most efficient at solving sudoku?
Response 2: Convolutional Neural Networks (CNNs) are particularly effective for solving Sudoku puzzles. They can capture the spatial hierarchies in the grid by processing parts of the grid as images, making them efficient for this type of puzzle-solving task.
...Seems to me that LLMs have no problem with this task.
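For reference, the CNN framing that response describes usually treats the grid itself as a tiny image: one-hot encode the 9x9 grid into 10 channels and predict a digit distribution for each of the 81 cells. A minimal, untrained Keras sketch (the layer sizes here are arbitrary assumptions, and a real solver would also need training data plus an iterative fill-in loop at inference time):

```python
import tensorflow as tf
from tensorflow.keras import layers

# Hypothetical CNN for sudoku: input is the 9x9 grid one-hot encoded
# into 10 channels (channel 0 = blank cell); output is a distribution
# over digits 1-9 for each of the 81 cells.
model = tf.keras.Sequential([
    layers.Conv2D(64, 3, padding="same", activation="relu",
                  input_shape=(9, 9, 10)),
    layers.Conv2D(64, 3, padding="same", activation="relu"),
    layers.Conv2D(128, 1, activation="relu"),
    layers.Conv2D(9, 1),             # 9 logits per cell
    layers.Reshape((81, 9)),         # one row of logits per cell
    layers.Softmax(),
])
model.compile(optimizer="adam", loss="sparse_categorical_crossentropy")
model.summary()
```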
To me it seems you can get the LLM to predict some tokens that contain words pointing to the right algorithm. But the LLM doesn't know what it chose; it just sees some tokens. Do you think it could somehow tell that it had chosen a CNN in its response and then do something with that knowledge to actually run one?