Comment by fabiospampinato
1 year ago
Apparently I can find some matches for games that start like that between very strong players [1], so my hypothesis that the model may just be predicting bad moves on purpose seems wobbly, although having stockfish at the lowest level play as the supposedly very strong opponent may still be throwing the model off somewhat. In the charts the first few moves the model makes seem decent, if I'm interpreting these charts right, and after a few of those things seem to start going wrong.
Either way it's worth repeating the experiment imo, tweaking some of these variables (prompt guidance, stockfish strength, starting position, the name of the supposed players, etc.).
[1]: https://www.365chess.com/search_result.php?search=1&p=1&m=8&...
Interesting thought the LLM isn’t trying to win, it’s trying to produce data like the input data. It’s quite rare for a very strong player to play a very weak one. If you feed it lots of weak moves it’ll best replicate the training data by following with weak moves.