← Back to context

Comment by bluechair

3 months ago

I had this exact same thought yesterday.

I’d go so far as to add one more layer to monitor this one and stop adding layers. My thinking is that this meta awareness is all you need.

No data to back my hypothesis up. So take it for what it’s worth.

This is where I was headed but I think you said it better. Some kind of executive process monitoring the situation, the random stream of consciousness and the actual output. Looping back around to outdated psychology you have the ego which is the output (speech), the super ego is the executive process and the id is the <think>internal monologue</think>. This isn't the standard definition of those three but close enough.

My thought on the same guess being - all tokens live in same latent space or in many spaces and each logical units train separate of each other…?