Comment by mike_hearn
5 hours ago
This result feels very intuitive. The early layers of a transformer can be thought of as handling surface-level things like syntax, how tokens group, which groups are entities and how to disambiguate them, etc. The last layers are in a sense decoding ideas into a selection of words, ensuring that the grammar makes sense and that the text flows and is structured correctly. The middle layers are where the abstract thought and manipulation of concepts happens.
But for the tasks this paper uses for RL training, it's all about improving the way the net is manipulating concepts. So the middle layers are where the focus should be.
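To make "focus on the middle layers" concrete, here is a rough sketch of how you might pick which layers to update during RL and leave the rest frozen. The function name and the 25% cutoffs at each end are my own assumptions for illustration, not anything taken from the paper.

```python
def middle_layer_indices(num_layers: int,
                         skip_early: float = 0.25,
                         skip_late: float = 0.25) -> list[int]:
    """Return indices of the layers to train, skipping a fraction at each end.

    The fractions are hypothetical defaults; the right cutoffs would have to
    be found empirically for a given model and task.
    """
    if num_layers < 1:
        raise ValueError("num_layers must be positive")
    if skip_early < 0 or skip_late < 0 or skip_early + skip_late >= 1:
        raise ValueError("skip fractions must be non-negative and sum to < 1")
    start = int(num_layers * skip_early)            # first trainable layer
    end = num_layers - int(num_layers * skip_late)  # one past the last trainable layer
    return list(range(start, end))


def trainable_mask(num_layers: int, **kwargs) -> list[bool]:
    """Per-layer flags: True means update this layer, False means freeze it."""
    chosen = set(middle_layer_indices(num_layers, **kwargs))
    return [i in chosen for i in range(num_layers)]


if __name__ == "__main__":
    print(middle_layer_indices(12))
    print(trainable_mask(8))
```

In a real training setup you would use the mask to turn gradients on or off per layer; how to do that depends on the framework and how the model names its layers.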
Note: RL is also used for tasks that aren't about conceptual manipulation, like instruct training. I bet that their result doesn't hold for that because the delta vs the foundation model is all about the selection of words and flow of the text, not the core understanding.
I keep thinking of the RYS (Repeat Yourself) experiment of simply looping some of the inner layers of LLMs for better results and wonder if any progress was made on it.
https://dnhkng.github.io/posts/rys/
It feels like it should be straightforward to integrate into an LLM a small network that controls the looping. Or just duplicate entire blocks of layers after the initial training.
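A toy sketch of both ideas, with plain functions standing in for transformer layers. This is not the RYS code or anyone's actual implementation; `run_with_loop`, `duplicated_order`, and the fixed `repeats` count are made up for illustration (a learned controller would choose `repeats` per input instead).

```python
from typing import Callable, Sequence, TypeVar

T = TypeVar("T")
Layer = Callable[[T], T]


def _check(num_layers: int, loop_start: int, loop_end: int, repeats: int) -> None:
    if not 0 <= loop_start <= loop_end <= num_layers:
        raise ValueError("need 0 <= loop_start <= loop_end <= number of layers")
    if repeats < 1:
        raise ValueError("repeats must be at least 1")


def run_with_loop(layers: Sequence[Layer], x: T,
                  loop_start: int, loop_end: int, repeats: int) -> T:
    """Run all layers in order, passing through layers[loop_start:loop_end]
    `repeats` times before continuing to the remaining layers."""
    _check(len(layers), loop_start, loop_end, repeats)
    for layer in layers[:loop_start]:          # early layers, once
        x = layer(x)
    for _ in range(repeats):                   # looped inner block
        for layer in layers[loop_start:loop_end]:
            x = layer(x)
    for layer in layers[loop_end:]:            # late layers, once
        x = layer(x)
    return x


def duplicated_order(num_layers: int, loop_start: int,
                     loop_end: int, repeats: int) -> list[int]:
    """Layer indices for the 'just duplicate the block' variant: the same
    execution order as the loop, but unrolled into a longer static stack."""
    _check(num_layers, loop_start, loop_end, repeats)
    block = list(range(loop_start, loop_end))
    return list(range(loop_start)) + block * repeats + list(range(loop_end, num_layers))


if __name__ == "__main__":
    toy_layers = [lambda v: v + 1, lambda v: v * 2, lambda v: v - 3]
    print(run_with_loop(toy_layers, 1, loop_start=1, loop_end=2, repeats=3))
    print(duplicated_order(6, loop_start=2, loop_end=4, repeats=2))
```

The two variants run the same computation; the difference is that the duplicated stack can be fine-tuned afterwards so the copies drift apart, while the loop keeps the weights shared.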
Yes, computing in latent space is a big thing now.
https://ouro-llm.github.io/