Comment by mlsu
5 hours ago
Imho at first blush this sounds fascinating and awesome and like it would indicate some higher-order spiritual oneness present in humanity that the model is discovering in its latent space.
However, it's far more likely that this attractor state comes from the post-training step. Which makes sense, they are steering the models to be positive, pleasant, helpful, etc. Different steering would cause different attractor states, this one happens to fall out of the "AI"/"User" dichotomy + "be positive, kind, etc" that is trained in. Very easy to see how this happens, no woo required.
No comments yet
Contribute on Hacker News ↗