Comment by viraptor
1 year ago
This is a puzzle given enough training information. An LLM can successfully print out the state of the board after the given moves. It can also produce a not-terrible summary of the position and list dangers at least one move ahead. Decent is subjective, but that should beat at least beginners. And the lowest level of Stockfish used in the blog post is lowest intermediate.
I don't really know what level we should be thinking of here, but I don't see any reason to dismiss the idea. Also, it really depends on whether you're thinking of the current public implementations of the tech, or the LLM idea in general. If we wanted to get better results, we could feed it way more chess books and past game analysis.
LLMs like GPT aren’t built to play chess, and here’s why: they’re made for handling language, not playing games with strict rules and strategies. Chess engines, like Stockfish, are designed specifically for analyzing board positions and making the best moves, but LLMs don’t even "see" the board. They’re just guessing moves based on text patterns, without understanding the game itself.
Plus, LLMs have limited memory, so they struggle to remember previous moves in a long game. It’s like trying to play blindfolded! They’re great at explaining chess concepts or moves but not actually competing in a match.
> but LLMs don’t even "see" the board
This is a very vague claim, but they can reconstruct the board from the list of moves, which I would say proves this wrong.
> LLMs have limited memory
For the recent models this is not a problem for the chess example. You can feed whole books into them if you want to.
> so they struggle to remember previous moves
Chess is stateless with perfect information. Unless you're going for mind games, you don't need to remember previous moves.
> They’re great at explaining chess concepts or moves but not actually competing in a match.
What's the difference between a great explanation of a move and explaining every possible move then selecting the best one?
Chess is not stateless. En passant requires knowing the last move, and castling rights depend on nearly all previous moves.
https://adamkarvonen.github.io/machine_learning/2024/01/03/c...
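To make the en passant and castling point concrete, here is a minimal sketch using FEN, the standard text notation for a chess position. A FEN string has six space-separated fields, and two of them (castling availability and the en passant target square) exist precisely because the piece layout alone does not tell you which moves are legal. The names `FenState`, `parse_fen` and `same_diagram` are made up for this illustration, and the sketch does no move validation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FenState:
    """Hypothetical container for the six fields of a FEN string."""
    placement: str        # piece layout, rank 8 down to rank 1
    side_to_move: str     # "w" or "b"
    castling: str         # e.g. "KQkq", or "-" if nobody may castle
    en_passant: str       # target square such as "e3", or "-"
    halfmove_clock: int   # half-moves since last capture or pawn move
    fullmove_number: int


def parse_fen(fen: str) -> FenState:
    """Split a FEN string into its fields (no legality checking)."""
    fields = fen.split()
    if len(fields) != 6:
        raise ValueError(f"expected 6 FEN fields, got {len(fields)}")
    placement, side, castling, ep, half, full = fields
    return FenState(placement, side, castling, ep, int(half), int(full))


def same_diagram(a: FenState, b: FenState) -> bool:
    """True if the boards look identical, ignoring history-derived fields."""
    return a.placement == b.placement and a.side_to_move == b.side_to_move


# Same pieces on the same squares, but different histories:
# in the second position the kings or rooks have already moved.
can_castle = parse_fen("r3k2r/8/8/8/8/8/8/R3K2R w KQkq - 0 1")
cannot_castle = parse_fen("r3k2r/8/8/8/8/8/8/R3K2R w - - 0 1")

print(same_diagram(can_castle, cannot_castle))
print(can_castle == cannot_castle)
```

The two positions are indistinguishable on a diagram, yet castling is legal in one and not the other. So the board picture alone is not the full state, although a short summary of the history (the extra FEN fields) is enough for these two rules.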
Chess is not stateless. Three repetitions of the same position lets a player claim a draw.
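A rough sketch of why repetition needs history: you have to count how often each position has occurred so far, which no single snapshot can tell you. This assumes the game is available as a list of FEN strings, and it treats two positions as equal when their first four FEN fields match, which is a simplification (the official rule only counts an en passant square when the capture is actually possible). `position_key` and `has_threefold_repetition` are hypothetical names.

```python
from collections import Counter


def position_key(fen: str) -> str:
    """Keep placement, side to move, castling and en passant; drop the clocks."""
    return " ".join(fen.split()[:4])


def has_threefold_repetition(fens: list[str]) -> bool:
    """True if any position key occurs at least three times in the game so far."""
    counts = Counter(position_key(fen) for fen in fens)
    return any(n >= 3 for n in counts.values())


# Knights shuffle out and back (Nf3 Nf6 Ng1 Ng8, twice), so the starting
# position occurs three times. Only the clocks differ between occurrences.
PLACEMENT = "rnbqkbnr/pppppppp/8/8/8/8/PPPPPPPP/RNBQKBNR w KQkq -"
history = [
    f"{PLACEMENT} 0 1",
    "rnbqkb1r/pppppppp/5n2/8/8/5N2/PPPPPPPP/RNBQKB1R w KQkq - 2 2",
    f"{PLACEMENT} 4 3",
    "rnbqkb1r/pppppppp/5n2/8/8/5N2/PPPPPPPP/RNBQKB1R w KQkq - 6 4",
    f"{PLACEMENT} 8 5",
]

print(has_threefold_repetition(history))
```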
> Chess is stateless with perfect information. Unless you're going for mind games, you don't need to remember previous moves.
While it can be played as stateless, remembering previous moves gives you insight into the strategy that is being built.
> Chess is stateless with perfect information.
It is not stateless, because good chess isn't played as a series of independent moves -- it's played as a series of moves connected to a player's strategy.
> What's the difference between a great explanation of a move and explaining every possible move then selecting the best one?
Continuing from the above, "best" in the latter sense involves understanding possible future moves after the next move.
Ergo, if I looked at all games with the current board state and chose the next move that won the most games, it'd be tactically sound but strategically ignorant.
Because many of those players were making that next move in support of some broader strategy.
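For illustration, the "pick whatever won the most from this position" approach could look like the sketch below. The record format (position label, move, whether the mover won) and the function name `most_winning_move` are assumptions for this example, and the records are toy data, not real game statistics.

```python
from collections import defaultdict
from typing import Optional


def most_winning_move(
    records: list[tuple[str, str, bool]], position: str
) -> Optional[str]:
    """Return the move with the highest win rate among games that reached `position`."""
    games: dict[str, int] = defaultdict(int)
    wins: dict[str, int] = defaultdict(int)
    for pos, move, mover_won in records:
        if pos != position:
            continue
        games[move] += 1
        wins[move] += int(mover_won)
    if not games:
        return None  # position never seen in the database
    return max(games, key=lambda m: wins[m] / games[m])


# Toy, invented records: (position label, move played, did the mover win?)
toy_records = [
    ("start", "e4", True),
    ("start", "e4", False),
    ("start", "d4", True),
]

print(most_winning_move(toy_records, "start"))
```

Note what the lookup ignores: it only sees the move that was played and the final result, not the plan the winner followed afterwards, which is the gap being described above.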
You can feed them whole books, but they have trouble with recall for specific information in the middle of the context window.
> Chess is stateless with perfect information. Unless you're going for mind games, you don't need to remember previous moves.
In what sense is chess stateless? Question: is Rxa6 a legal move? You need board state to refer to in order to decide.
> they’re made for handling language, not playing games with strict rules and strategies
Here's the opposite theory: Language encodes objective reasoning (or at least, it does some of the time). A sufficiently large ANN trained on sufficiently large amounts of text will develop internal mechanisms of reasoning that can be applied to domains outside of language.
Based on what we are currently seeing LLMs do, I'm becoming more and more convinced that this is the correct picture.
I share this idea, but from a different perspective. It doesn’t develop these mechanisms, but casts a high-dimensional-enough shadow of their effect on itself. This vaguely explains why the deeper you are, Gell-Mann-wise, the less sharp that shadow is, because specificity cuts off “reasoning” hyperplanes.
It’s hard to explain emerging mechanisms because of the nature of generation, which is one-pass sequential matrix reduction. I say this while waving my hands, but listen. Reasoning is similar to Turing complete algorithms, and what LLMs can become through training is similar to limited pushdown automata at best. I think this is a good conceptual handle for it.
“Line of thought” is an interesting way to loop the process back, but it doesn’t show that much improvement, afaiu, and is still finite.
Otoh, a chess player takes as much time and “loops” as they need to get the result (ignoring competitive time limits).
LLMs need to compress information to be able to predict next words in as many contexts as possible.
Chess moves are simply tokens like any other. Given enough chess training data, it would make sense to have part of the network trained to handle chess specifically instead of simply encoding basic lists of moves and follow-ups. The result would be a general purpose sub-network trained on chess.
Language is a game with strict rules and strategies.
just curious, was this rephrased by an llm or is that your writing style?
Stockfish level 1 is well below "lowest intermediate".
A friend of mine just started playing chess a few weeks ago and can beat it about 25% of the time.
It will hang pieces, and you can hang your own queen and there's about a 50% chance it won't be taken.