Comment by computerex

1 year ago

The question here is why gpt-3.5-turbo-instruct can then beat Stockfish.

PS: I ran it and, as suspected, gpt-3.5-turbo-instruct does not beat Stockfish. It is not even close: "Final Results: gpt-3.5-turbo-instruct: Wins=0, Losses=6, Draws=0, Rating=1500.00 stockfish: Wins=6, Losses=0, Draws=0, Rating=1500.00" https://www.loom.com/share/870ea03197b3471eaf7e26e9b17e1754?...
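For context on the rating figures in that output: both sides are reported at 1500.00 even after a 6-0 score, which suggests the ratings shown are just starting values rather than updated ones. A minimal sketch of a standard Elo update follows, assuming K=32 and a 1500 starting rating. The function names and the K-factor are illustrative assumptions, not taken from the script that produced the output above.

```python
def expected_score(rating_a, rating_b):
    """Expected score of player A against player B under the Elo model."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400.0))


def update_elo(rating_a, rating_b, score_a, k=32.0):
    """Return both ratings after one game.

    score_a is 1.0 for a win by A, 0.5 for a draw, 0.0 for a loss.
    k=32 is an assumed K-factor, not one taken from the original script.
    """
    delta = k * (score_a - expected_score(rating_a, rating_b))
    return rating_a + delta, rating_b - delta


def play_series(scores_a, start=1500.0, k=32.0):
    """Apply sequential Elo updates for a list of A's game scores."""
    a = b = start
    for score in scores_a:
        a, b = update_elo(a, b, score, k)
    return a, b


# Six straight losses for the model, as in the quoted results.
gpt, stockfish = play_series([0.0] * 6)
print(f"gpt-3.5-turbo-instruct: {gpt:.2f}, stockfish: {stockfish:.2f}")
```

Under these assumptions the two ratings move apart after the first game and keep diverging, so identical 1500.00 values after six decisive games would not come out of an update like this.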

  • Maybe there's some difference in the setup, because the OP reports that the model beats Stockfish (as they had it configured) every single game.

Cheating (using an internal chess engine) would be the obvious reason to me.

The article appears to have only run Stockfish at low levels. You don't have to be very good to beat it.

I'm actually surprised any of them manage to make legal moves throughout the game once they are out of book moves.