It sorta played chess: he let it generate up to ten moves, throwing away any that weren't legal, and if no legal move had been generated by the 10th try he picked a random legal move. He does not say how many times he had to fall back to a random move, or how many illegal moves were generated.
That was for the OpenAI games, including the ones that won. For the ones he ran himself with open-source LLMs, he restricted their grammar to just legal moves, so the model could only respond with a legal move. But that came from a separate process he added on top of the LLM.
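For anyone who wants to see what that retry scheme amounts to, here's a rough sketch. It is not the blog author's code: `pick_move`, `propose`, and the return tuple are names made up for illustration, and the legal-move list is assumed to come from some chess library. It also counts the two numbers the post doesn't report (illegal attempts and random fallbacks).

```python
import random
from typing import Callable, Optional, Sequence, Tuple


def pick_move(
    propose: Callable[[], str],       # hypothetical: one call = one sampled move from the model
    legal_moves: Sequence[str],       # assumed non-empty, e.g. produced by a chess library
    max_tries: int = 10,
    rng: Optional[random.Random] = None,
) -> Tuple[str, int, bool]:
    """Return (move, illegal_attempts, used_random_fallback)."""
    rng = rng or random.Random()
    legal = set(legal_moves)
    illegal_attempts = 0
    for _ in range(max_tries):
        candidate = propose().strip()
        if candidate in legal:
            # First legal proposal wins; earlier illegal ones were discarded.
            return candidate, illegal_attempts, False
        illegal_attempts += 1
    # No legal move within max_tries: substitute a random legal move.
    return rng.choice(list(legal_moves)), illegal_attempts, True


# Toy usage with a canned "model" that gets it right on the third try.
proposals = iter(["Ke9", "Qxz4", "e4"])
result = pick_move(lambda: next(proposals), ["e4", "d4", "Nf3"])
print(result)  # ('e4', 2, False)
```

Logging the second and third values per move would answer the question of how often the model actually needed help.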
It sorta played chess: he let it generate up to ten moves, throwing away any that weren't legal, and if no legal move had been generated by the 10th try he picked a random legal move. He does not say how many times he had to fall back to a random move, or how many illegal moves were generated.
You're right, it's not in this blog post, but turbo-instruct's chess ability has been pretty thoroughly tested, and it does play chess.
https://github.com/adamkarvonen/chess_gpt_eval
Ah, I didn't see the illegal move discarding.
That was for the OpenAI games, including the ones that won. For the ones he ran himself with open-source LLMs, he restricted their grammar to just legal moves, so the model could only respond with a legal move. But that came from a separate process he added on top of the LLM.
Again, this isn't exactly HAL playing chess.