Comment by shwaj

6 days ago

In this context I believe it refers to models that are trained to generate an internal dialogue that is then fed back in as additional input. This cycle might be performed several times before generating the final output text.
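
A minimal sketch of that cycle, assuming a hypothetical `generate(context, mode)` call standing in for one full decoding pass of the model (the stub below is illustrative, not any real library's API):

```python
from typing import Literal

def generate(context: str, mode: Literal["think", "answer"]) -> str:
    """Stub standing in for one full decoding pass (hypothetical API)."""
    return f"<{mode} text conditioned on {len(context)} chars of context>"

def answer_with_internal_dialogue(prompt: str, rounds: int = 3) -> str:
    context = prompt
    for _ in range(rounds):
        thought = generate(context, mode="think")  # model emits internal dialogue
        context += "\n" + thought                  # fed back in as additional input
    return generate(context, mode="answer")        # final output text
```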

This is in contrast to how GPT-2/3/“original 4” work: they repeatedly generate the next finalized token conditioned on the full dialogue thus far.
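
For comparison, a sketch of that standard autoregressive loop, again with hypothetical stubs (`next_token`, `EOS`) in place of a real model:

```python
EOS = 0  # hypothetical end-of-sequence token id

def next_token(tokens: list[int]) -> int:
    """Stub standing in for one forward pass that picks the next token."""
    return EOS  # a real model would sample from its output distribution

def answer_autoregressively(prompt_tokens: list[int], max_new: int = 50) -> list[int]:
    tokens = list(prompt_tokens)
    for _ in range(max_new):
        tok = next_token(tokens)  # each token is finalized as soon as it is produced
        tokens.append(tok)        # the growing sequence is the only state carried forward
        if tok == EOS:
            break
    return tokens
```

The key difference is that here every emitted token is immediately final, whereas the internal-dialogue approach can revise its direction across several hidden rounds before committing to output.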