Comment by Scene_Cast2
1 year ago
Not quite an LLM. It's a transformer model, but there's no tokenizer or words, just chess board positions (64 tokens, one per board square). It's purpose-built for chess (never sees a word of text).
In fact, the unusual aspect of this chess engine is not that it's using neural networks (even Stockfish does, these days!), but that it's only using neural networks.
Chess engines essentially do two things: calculate the value of a given position for their side, and walk the game tree while evaluating its positions in that way.
Historically, position value was a handcrafted function using win/lose criteria (e.g. being able to give checkmate is infinitely good) and elaborate heuristics informed by real chess games (e.g. having more space on the board is good, having a high-value piece threatened by a low-value one is bad). The strength of engines largely resulted from being able to "search the game tree" for good positions very broadly and deeply.
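To make the "evaluate plus search" split concrete, here is a minimal sketch, not any real engine's code. It assumes a material-only evaluation with the conventional textbook piece values and a plain negamax search over a tiny made-up game tree. The names material_eval, negamax, TOY_TREE and TOY_SCORES are all hypothetical.

```python
from typing import Callable, Dict, Iterable, List

# Conventional textbook piece values (an assumption for illustration);
# real handcrafted evaluations add many more heuristics.
PIECE_VALUES = {"P": 1, "N": 3, "B": 3, "R": 5, "Q": 9, "K": 0}


def material_eval(placement: str) -> int:
    """Material balance from White's side for a FEN-style piece string.

    Uppercase letters are White pieces, lowercase are Black; digits and
    slashes (empty squares, rank separators) are ignored.
    """
    score = 0
    for ch in placement:
        value = PIECE_VALUES.get(ch.upper())
        if value is not None:
            score += value if ch.isupper() else -value
    return score


def negamax(
    state: str,
    depth: int,
    evaluate: Callable[[str], float],
    children: Callable[[str], Iterable[str]],
    color: int = 1,
) -> float:
    """Best achievable score for the side to move (color=1 means White)."""
    kids = list(children(state))
    if depth == 0 or not kids:
        # Leaf: score the position from the side to move's point of view.
        return color * evaluate(state)
    # Our best score is the negation of the opponent's best reply.
    return max(-negamax(k, depth - 1, evaluate, children, -color) for k in kids)


# A made-up two-ply tree: labels are placeholders, not real chess positions.
TOY_TREE: Dict[str, List[str]] = {
    "root": ["a", "b"],
    "a": ["a1", "a2"],
    "b": ["b1", "b2"],
}
TOY_SCORES: Dict[str, float] = {"a1": 3, "a2": 5, "b1": -1, "b2": 9}


def toy_children(state: str) -> List[str]:
    return TOY_TREE.get(state, [])


def toy_eval(state: str) -> float:
    return TOY_SCORES.get(state, 0)


best = negamax("root", 2, toy_eval, toy_children)
```

In the toy tree, move "b" has the single best leaf (9), but the opponent would answer with the -1 line, so the search prefers "a". That is the kind of mistake a one-ply evaluation cannot catch on its own, which is why classical engines lean so heavily on depth.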
Recently, neural networks (trained on many simulated games) have been replacing these hand-crafted position evaluation functions, but there's still a ton of search going on. In other words, the networks are still largely "dumb but fast", and without deep search they'll lose against even a novice player.
This paper now presents a searchless chess engine, i.e. one that essentially "looks at the board once" and "intuits the best next move", without "calculating" resulting hypothetical positions at all. In the words of Capablanca, a chess world champion also cited in the paper: "I see only one move ahead, but it is always the correct one."
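As a rough illustration of what "searchless" means in code, here is a sketch under stated assumptions. It follows the "64 tokens, one per board square" description from the comment above, and it uses a stand-in scoring function where the trained transformer would go. The names fen_to_tokens, pick_move and dummy_scores are hypothetical, and this is not the paper's actual input encoding or model.

```python
from typing import Callable, Dict, List, Sequence


def fen_to_tokens(placement: str) -> List[str]:
    """Expand a FEN piece-placement field into 64 tokens, one per square.

    Piece letters are kept as-is and each empty square becomes ".".
    """
    tokens: List[str] = []
    for ch in placement:
        if ch == "/":
            continue  # rank separator, carries no square
        if ch.isdigit():
            tokens.extend(["."] * int(ch))  # run of empty squares
        else:
            tokens.append(ch)
    if len(tokens) != 64:
        raise ValueError(f"expected 64 squares, got {len(tokens)}")
    return tokens


def pick_move(
    tokens: Sequence[str],
    legal_moves: Sequence[str],
    score_move: Callable[[Sequence[str], str], float],
) -> str:
    """Choose a move from the current position only: no tree, no lookahead."""
    return max(legal_moves, key=lambda move: score_move(tokens, move))


# Stand-in for a trained network: fixed, made-up scores for three moves.
dummy_scores: Dict[str, float] = {"e2e4": 0.6, "d2d4": 0.3, "g1f3": 0.1}

start = fen_to_tokens("rnbqkbnr/pppppppp/8/8/8/8/PPPPPPPP/RNBQKBNR")
chosen = pick_move(start, list(dummy_scores), lambda toks, m: dummy_scores[m])
```

The point of the sketch is what is missing: pick_move never generates a successor position. All of the playing strength would have to come from score_move, i.e. from the network itself.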
The fact that this is possible can be considered surprising, a testament to the power of transformers etc., but it does indeed have nothing to do with language or LLMs (other than that the best ones known to date are based on the same architecture).