Comment by fc417fc802

2 months ago

I ask a human "predict what a mouse would do here". In an effort to understand why the prediction is sometimes wrong, I ask "walk me through what the imaginary mouse is thinking". Upon examination I exclaim "aha! there's the error", but sadly it actually isn't, because the output prediction was not based on the thinking trace in any robust manner.

That's a loose analogy, but it fails to fully illustrate the degree of decoupling here. For example, consider the weirdness of LLM performance increasing when the model outputs empty sequences.

> I ask a human "predict what a mouse would do here". In an effort to understand why the prediction is sometimes wrong, I ask "walk me through what the imaginary mouse is thinking". Upon examination I exclaim "aha! there's the error", but sadly it actually isn't, because the output prediction was not based on the thinking trace in any robust manner.

Is this meant to be an analogy for a human or an LLM? Where would it be different in the other case?