robbie-c 1 year ago Probably not, see https://www.anthropic.com/research/reasoning-models-dont-say...
kevinventullo 1 year ago It would be interesting to apply interpretability techniques to understand how the model really reasons about it.