Comment by SomaticPirate

7 hours ago

inb4 this technique is subsumed into the next MoE model release

LLMs are evolving so fast I wouldn't be surprised if this technique were no longer needed in <6 months

I don't think the MoE part has anything to do with it, but the current gen of multimodal models can do thinking interleaved with autoregressive(?) image-gen, so it's probably not long before they bake this into the RL process, the same way native thought obviated the need for "think carefully step by step" prompts.