
Comment by codeulike

2 months ago

Wait, so the trick is they reach into the context and basically switch '</think>' with 'wait' and that makes it carry on thinking?

Not sure if your pun was intended, but 'wait' probably works so well because the models were trained on text structured like your comment, where "wait" is followed by a deeper understanding.

Yes, that's explicitly mentioned in the blog post:

> In s1, when the LLM tries to stop thinking with "</think>", they force it to keep going by replacing it with "Wait".
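
For anyone who wants to see the mechanics, here's a toy sketch of that loop. The `sample_next_chunk` callable is a stand-in for whatever decoding call you actually use (e.g. `model.generate` with "</think>" as a stop string), and the budget value and helper names are mine; only the "</think>" → "Wait" substitution comes from the s1 description.

```python
from typing import Callable, Tuple

def budget_forced_generation(
    prompt: str,
    sample_next_chunk: Callable[[str], Tuple[str, int]],
    min_thinking_tokens: int = 1000,
) -> str:
    """Keep decoding; if the model emits '</think>' before the thinking budget
    is used up, strip the tag and append 'Wait' so it keeps reasoning."""
    text = prompt
    thinking_tokens = 0
    while True:
        # sample_next_chunk decodes until '</think>' or EOS and reports how
        # many tokens it produced (hypothetical helper, not from the paper).
        chunk, n_tokens = sample_next_chunk(text)
        thinking_tokens += n_tokens
        if chunk.endswith("</think>") and thinking_tokens < min_thinking_tokens:
            # Model tried to stop thinking too early: drop '</think>' and
            # splice in 'Wait' so the next call continues from there.
            text = text + chunk[: -len("</think>")] + "Wait"
            continue
        return text + chunk
```

So nothing is edited retroactively inside the existing context; the stop tag just never makes it into the transcript while the budget isn't spent, and "Wait" is what the model sees as its own last word.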