Humans and machines find good moves in different ways.
Most humans have fast pattern matching that is quite good at finding some reasonable moves.
There are also classes of moves that all humans will spot. (You just moved your bishop; now it's pointing at my queen.)
The problem is that Stockfish scores every move with a single number measuring how good it is. That number tells you nothing about whether a human would agree.
For example, miscalculating a series of trades four moves deep is a very human mistake, but it's scored the same as moving the bishop to a square where it can easily be taken by a pawn: both leave you a bishop down. A nerfed Stockfish bot is equally likely to play either of those moves.
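A minimal sketch of that point, using hypothetical centipawn evaluations rather than real Stockfish output (the move labels and `nerfed_pick` helper are made up for illustration). A common way to weaken an engine is to pick randomly among all moves within some margin of the best one, and the score alone can't distinguish the human-plausible blunder from the obviously dumb one:

```python
import random

# Hypothetical evaluations (centipawns, higher is better for the side
# to move). Both losing moves drop roughly a bishop, but one is a
# subtle miscalculated trade and the other an immediate hang.
scores = {
    "Nf3 (solid developing move)": 20,
    "Re1 (reasonable rook move)": 10,
    "Qxd5 (miscalculated trade, loses a bishop four moves later)": -270,
    "Bc4 (hangs the bishop to a pawn immediately)": -280,
}

def nerfed_pick(scores, blunder_margin=300, rng=random):
    """Pick uniformly among all moves within `blunder_margin`
    centipawns of the best move -- one typical way to nerf a bot.
    Nothing in the score says which blunders a human would make,
    so the subtle and the obvious blunder are equally likely."""
    best = max(scores.values())
    candidates = [m for m, s in scores.items() if best - s <= blunder_margin]
    return rng.choice(candidates)

print(nerfed_pick(scores))
```

With a 300-centipawn margin all four moves are candidates, so the miscalculated trade and the obvious hang get exactly the same probability, even though a human opponent would experience them very differently.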
You might think that you could have a list of dumb move types that the bot might play, but there are thousands of possible obviously dumb moves. This is a problem for machine learning.
I'd call it an approach issue: LLM vs brute-force lookahead.
An LLM is predicting what comes next based on its training set. If it's trained on human games, it should play like a human; if it's trained on Stockfish games, it should play more like Stockfish.
Stockfish, or any chess engine using brute-force lookahead, is just trying to find the optimal move, not copying any style of play, so its moves are sometimes going to look very un-human. Imagine the human player is looking 10-15 moves ahead, but Stockfish 40-50 moves ahead... what looks good 40-50 moves out can be quite different from what looks good to the human.