Comment by stingraycharles

7 months ago

The idea is that instead of giving 10,000 thinking tokens to one chain of thought, you give 1,000 thinking tokens to each of 10 chains of thought and compose those independent outputs into a single output, and that this yields better results.

The fact that it can be done in parallel is just a bonus.
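
A rough sketch of the shape of this, under some loud assumptions: `chain_fn` is a hypothetical stand-in for whatever model call you use (not a real API), and the composition step here is a plain majority vote, which is just one simple choice since nothing above says how the outputs actually get composed.

```python
import random
from collections import Counter
from concurrent.futures import ThreadPoolExecutor


def compose_chains(prompt, chain_fn, total_budget=10_000, n_chains=10):
    """Split one thinking budget across several independent chains.

    chain_fn(prompt, budget, seed) -> str is a hypothetical callable that
    runs one chain of thought capped at `budget` thinking tokens.
    """
    per_chain = total_budget // n_chains  # 10,000 / 10 = 1,000 each

    # The chains don't depend on each other, so they can run in parallel.
    with ThreadPoolExecutor(max_workers=n_chains) as pool:
        answers = list(
            pool.map(lambda seed: chain_fn(prompt, per_chain, seed), range(n_chains))
        )

    # Assumed composition step: keep the most common answer.
    winner, _count = Counter(answers).most_common(1)[0]
    return winner, answers


def toy_chain(prompt, budget, seed):
    """Toy stand-in for a model: a noisy solver that is often, not always, right."""
    rng = random.Random(seed)
    return "42" if rng.random() < 0.6 else str(rng.randint(0, 9))


if __name__ == "__main__":
    winner, answers = compose_chains("What is 6 * 7?", toy_chain)
    print(winner, answers)
```

In practice the composition step could just as well be another model call that reads all 10 outputs; the vote is only there to keep the sketch self-contained.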