Comment by Imnimo

2 years ago

I'd be curious to see if, in the 1-2% of cases where the linear probe fails to predict board occupancy, the LLM also predicts (or at least assigns non-trivial probability to) a corresponding illegal move. For example, if the linear probe incorrectly thinks there's a bishop on b4, does the LLM give more probability to illegal bishop moves along the corresponding diagonals than to other illegal bishop moves?
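
One way to make that comparison concrete is sketched below. It is only an illustration under stated assumptions: `illegal_move_probs` is a hypothetical, already-aggregated mapping from (from-square, to-square) pairs to the probability the LLM assigns to each illegal bishop move, and `phantom_square` is the square where the probe wrongly reports a bishop. Neither name comes from the original work, and blocking pieces are ignored for simplicity.

```python
FILES = "abcdefgh"


def diagonal_squares(square):
    """Squares a bishop on `square` could reach on an otherwise empty board."""
    f, r = FILES.index(square[0]), int(square[1]) - 1
    reachable = set()
    for df, dr in ((1, 1), (1, -1), (-1, 1), (-1, -1)):
        nf, nr = f + df, r + dr
        while 0 <= nf < 8 and 0 <= nr < 8:
            reachable.add(f"{FILES[nf]}{nr + 1}")
            nf, nr = nf + df, nr + dr
    return reachable


def compare_illegal_mass(phantom_square, illegal_move_probs):
    """Return (mean prob of illegal moves consistent with the probe's error,
    mean prob of all other illegal bishop moves).

    `illegal_move_probs` is an assumed input: {(src, dst): probability}.
    """
    targets = diagonal_squares(phantom_square)
    consistent, other = [], []
    for (src, dst), p in illegal_move_probs.items():
        # "Consistent" means the move starts on the phantom bishop's square
        # and travels along one of its diagonals.
        if src == phantom_square and dst in targets:
            consistent.append(p)
        else:
            other.append(p)

    def mean(xs):
        return sum(xs) / len(xs) if xs else 0.0

    return mean(consistent), mean(other)
```

If the first number were reliably larger than the second across the probe's failure cases, that would suggest the LLM shares the probe's mistaken picture of the board rather than the probe simply misreading a correct internal state.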