Comment by lmm

2 months ago

> I ask a human "predict what a mouse would do here". In an effort to understand why the prediction is sometimes wrong, I ask "walk me through what the imaginary mouse is thinking". Upon examination I exclaim "aha! there's the error", but sadly it isn't actually the error, because the output prediction was not based on the thinking trace in any robust manner.

Is this meant to be an analogy for a human or an LLM? Where would it be different in the other case?