Comment by inciampati

15 hours ago

Looking at o1's behavior, it seems there's a key architectural limitation: while it can see chat history, it doesn't seem able to access its own reasoning steps after outputting them. This is particularly significant because it breaks the computational expressivity that made chain-of-thought prompting work in the first place—the ability to build up complex reasoning through iterative steps.

This will only improve when o1's context windows grow large enough to maintain all its intermediate thinking steps; we're talking orders of magnitude beyond current limits. Until then, this isn't just a UX quirk; it's a fundamental constraint on the model's ability to develop thoughts over time.

> This will only improve when o1's context windows grow large enough to maintain all its intermediate thinking steps; we're talking orders of magnitude beyond current limits.

Rather than retaining all those steps, what about just retaining a summary of them? Or putting them in a vector DB, so that on a follow-up it can retrieve the subset of steps most relevant to the follow-up question?
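
A rough sketch of that retrieval idea, under the assumption that the intermediate steps are available as plain text. The `embed` function here is a hypothetical stand-in for whatever embedding model you'd actually use; everything else is just top-k cosine-similarity lookup:

```python
import numpy as np

def embed(text: str) -> np.ndarray:
    """Hypothetical embedding function; in practice this would call an
    embedding model (a sentence-transformer, an API, etc.)."""
    rng = np.random.default_rng(abs(hash(text)) % (2**32))
    v = rng.normal(size=384)                 # toy fixed-size vector
    return v / np.linalg.norm(v)

# Index each intermediate reasoning step as its own vector.
reasoning_steps = [
    "Step 1: restate the user's constraint on memory usage.",
    "Step 2: compare streaming vs. batch approaches.",
    "Step 3: conclude that a streaming design fits the constraint.",
]
index = np.stack([embed(s) for s in reasoning_steps])

def retrieve(query: str, k: int = 2) -> list[str]:
    """Return the k reasoning steps most similar to the follow-up question."""
    q = embed(query)
    scores = index @ q                       # cosine similarity (unit-norm vectors)
    top = np.argsort(scores)[::-1][:k]
    return [reasoning_steps[i] for i in top]

# On a follow-up turn, only the retrieved subset is re-injected into context.
print(retrieve("Why did you pick a streaming design?"))
```

The point being that the full chain never has to fit in the window; only the retrieved subset does, per follow-up.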

  • That’s kind of what (R/C)NNs did before the "Attention Is All You Need" paper introduced the attention mechanism. One of the breakthroughs that enabled GPT is giving each token direct, equal "weight" through self-attention instead of letting earlier tokens get attenuated by some sort of summarization mechanism; see the toy sketch below.
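
A toy contrast of the two regimes, assuming plain numpy and random vectors (simplified single-query attention, no projections or multiple heads): attention gives every past token its own weight in the output, while a recurrent-style update folds everything into one fixed-size state that earlier tokens must survive.

```python
import numpy as np

rng = np.random.default_rng(0)
d = 8
tokens = rng.normal(size=(5, d))      # 5 toy token representations

# Attention: the query scores every token individually; each token
# contributes to the output through its own softmax weight.
query = rng.normal(size=d)
scores = tokens @ query / np.sqrt(d)
weights = np.exp(scores - scores.max())
weights /= weights.sum()
attended = weights @ tokens           # weighted mix over all tokens
print("per-token attention weights:", weights)

# Recurrent-style summary: each step is folded into a single fixed-size
# state, so earlier tokens are progressively attenuated.
W = rng.normal(size=(d, d)) * 0.1
state = np.zeros(d)
for t in tokens:
    state = np.tanh(W @ state + t)
print("single summary state:", state.shape)
```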

Is that relevant here? The post discussed writing a long prompt to get a good answer, not issues with, e.g., step #2 forgetting what was done in step #1.