← Back to context

Comment by simonw

1 year ago

From this OpenAI paper (page 29 https://arxiv.org/pdf/2312.09390#page=29

"A.2 CHESS PUZZLES

Data preprocessing. The GPT-4 pretraining dataset included chess games in the format of move sequence known as Portable Game Notation (PGN). We note that only games with players of Elo 1800 or higher were included in pretraining. These games still include the moves that were played in- game, rather than the best moves in the corresponding positions. On the other hand, the chess puzzles require the model to predict the best move. We use the dataset originally introduced in Schwarzschild et al. (2021b) which is sourced from https://database.lichess.org/#puzzles (see also Schwarzschild et al., 2021a). We only evaluate the models ability to predict the first move of the puzzle (some of the puzzles require making multiple moves). We follow the pretraining for- mat, and convert each puzzle to a list of moves leading up to the puzzle position, as illustrated in Figure 14. We use 50k puzzles sampled randomly from the dataset as the training set for the weak models and another 50k for weak-to-strong finetuning, and evaluate on 5k puzzles. For bootstrap- ping (Section 4.3.1), we use a new set of 50k puzzles from the same distribution for each step of the process."