Comment by SomaticPirate
7 hours ago
inb4 this technique is subsumed into the next MoE model release
LLMs are evolving so fast I wouldn’t be surprised if this technique is no longer needed within 6 months
I don't think the MoE part has anything to do with it, but the current generation of multimodal models can do thinking interleaved with autoregressive(?) image generation, so it's probably not long before they bake this into the RL process, the same way native thought obviated the need for "think carefully step by step" prompts.
LLMs are rather devolving at this point.