Comment by mjburgess

2 years ago

There are many notions of "prediction" and "generalisation" -- the relevant ones here, the ones that actually apply to NNs, are extremely limited. That's the problem with all this deceptive language -- it invites people to think NNs predict in the sense of simulating, and generalise in the sense of applying a concept across different effect domains.

NNs cannot apply a 'concept' across different 'effect' domains, because they have only one effect domain: the training data. They are just models of how the effect shows itself in that data.

This is why they do not have world models: they are not generalising from data by building an effect-neutral model of the thing itself; they're just modelling its effects.

Compare having a model of a 3D scene vs. a model of the shadows cast by a fixed set of 3D objects. NNs generalise in the sense that they can still predict for shadows similar to those in their training set. They cannot predict the 3D structure itself, and with sufficiently novel objects they fail catastrophically.
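A minimal sketch of this failure mode, assuming scikit-learn is available -- a sine curve stands in for the shadow-generating process (my choice of stand-in, not anything specific to the argument above). A small MLP fits the "shadows" it was shown, interpolates well on similar inputs, and falls apart entirely outside the training range, because it modelled the samples, not the generator:

    import numpy as np
    from sklearn.neural_network import MLPRegressor

    rng = np.random.default_rng(0)

    # "Shadows" the net gets to see: a smooth signal sampled on [0, 2*pi].
    X_train = rng.uniform(0, 2 * np.pi, size=(2000, 1))
    y_train = np.sin(X_train).ravel()

    net = MLPRegressor(hidden_layer_sizes=(64, 64), max_iter=5000, random_state=0)
    net.fit(X_train, y_train)

    # Shadows similar to the training set: fresh samples from the same interval.
    X_in = rng.uniform(0, 2 * np.pi, size=(500, 1))
    # "Sufficiently novel objects": the same generating process, outside that interval.
    X_out = rng.uniform(4 * np.pi, 6 * np.pi, size=(500, 1))

    err_in = np.mean((net.predict(X_in) - np.sin(X_in).ravel()) ** 2)
    err_out = np.mean((net.predict(X_out) - np.sin(X_out).ravel()) ** 2)

    print(f"in-distribution MSE:     {err_in:.4f}")   # small: interpolation works
    print(f"out-of-distribution MSE: {err_out:.4f}")  # large: no model of the generator

The exact numbers depend on seeds and architecture, but the gap between the two errors is the point: nothing in training forces the net to recover sin() itself, only its visible effects on [0, 2*pi].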