← Back to context

Comment by observationist

1 month ago

https://arxiv.org/abs/2505.12540 https://arxiv.org/abs/2405.07987

These, other papers, and the lottery ticket phenomenon; what it boils down to is that any neural network like system which encodes some common mapping of a phenomenon in the context of the world - not necessarily a world model, but some "real-world thing" - will tend to map to a limited number of permutations of some archetypal representation, which will resemble other mappings of the same thing.

The lottery ticket phenomenon is a bit like the birthday paradox; there will be some number of structures in a large, random initialization of neural network weights that coincide with one or more archetypal mappings of complex objects. Some sub-networks are also useful mappings to features of one or more complex objects, which makes learning hierarchical nested networks of feature mappings easier; it's also why interpretability is so damned difficult.