Comment by fc417fc802
15 days ago
> is not an explanatory model of the weather (the weather is not a neural net)
I don't follow. Aren't those entirely separate things? The most accurate models of anything necessarily account for the underlying mechanisms. Perhaps I don't understand what you mean by "explanatory"?
Specifically in the case of deep neural networks, we would generally suppose that it had learned to model the underlying reality. In effect it is learning the rules of a sufficiently accurate simulation.
> The most accurate models of anything necessarily account for the underlying mechanisms
But they don't necessarily convey understanding to humans. Prediction is not explanation.
There is a difference between Einstein's General Theory of Relativity and a deep neural network that predicts gravity. The latter is virtually useless for understanding gravity (even if it makes better predictions).
> Specifically in the case of deep neural networks, we would generally suppose that it had learned to model the underlying reality. In effect it is learning the rules of a sufficiently accurate simulation.
No, they just fit surface statistics, not underlying reality. Many physics phenomena were predicted using theories before they were observed; they would not be in the training data even though they were part of the underlying reality.
> No, they just fit surface statistics, not underlying reality.
I would dispute this claim. I would argue that as models become more accurate they necessarily more closely resemble the underlying phenomena which they seek to model. In other words, I would claim that as a model more closely matches those "surface statistics" it necessarily more closely resembles the underlying mechanisms that gave rise to them. I will admit that's just my intuition though - I don't have any means of rigorously proving such a claim.
I have yet to see an example where a more accurate model was conceptually simpler than the simplest known model at some lower level of accuracy. From an information-theoretic angle I think it's similar to compression (something that ML also happens to be almost unbelievably good at). Related to this, I've seen it argued somewhere (I don't immediately recall where) that learning (in both the ML and human sense) amounts to constructing a world model via compression, and that rings true to me.
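As a toy sketch of that compression intuition (a hypothetical illustration added for concreteness; the sine rule below just stands in for whatever a model actually learns): once you know the generating rule, the data collapses to "rule plus residuals", and the residuals compress to almost nothing.

    import math, zlib

    # Toy illustration, not a rigorous argument: if you know the rule that
    # generated the data, you only need to store the rule plus residuals.
    N = 10_000
    raw = bytes(int(127 * (math.sin(0.01 * i) + 1)) for i in range(N))

    # Pretend we have learned the generating rule exactly.
    pred = bytes(int(127 * (math.sin(0.01 * i) + 1)) for i in range(N))
    residuals = bytes((r - p) % 256 for r, p in zip(raw, pred))

    print("raw, compressed:      ", len(zlib.compress(raw)))
    print("residuals, compressed:", len(zlib.compress(residuals)))  # 10,000 zero bytes shrink to a handful

An MDL-style accounting would also charge for the bits needed to describe the rule itself, but that description is tiny compared to the raw samples.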
> Many physics phenomena were predicted using theories before they were observed
Sure, but what leads to those theories? They are invariably the result of attempting to more accurately model the things which we can observe. In the process of refining our existing models, we predict new things that we've never seen, and those predictions are then used to test the validity of the newly proposed models.
This is getting away from the original point, which is that deep neural networks are, by default, not explanatory in the way Einstein's theory of relativity is.
But even so,
> In other words, I would claim that as a model more closely matches those "surface statistics" it necessarily more closely resembles the underlying mechanisms that gave rise to them.
I don't know what it means, for example, for a deep neural network to "more closely resemble" the underlying process of the weather. It's also obviously false in general: if you have a mechanical clock and a quartz-crystal analog clock, you are not going to be able to derive the internal workings of either, or distinguish between them, from the hand positions. The same is true for two different pseudo-random number generator circuits that produce the same output.
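As a toy sketch of that last point (a hypothetical illustration; it uses output that merely shares the same surface statistics rather than being bit-for-bit identical): two generators with completely different internal update rules produce streams that simple statistics cannot tell apart.

    # Toy sketch: two generators with entirely different internals whose
    # outputs look alike under simple surface statistics.

    def lcg(seed, n):
        # Linear congruential generator.
        x = seed
        for _ in range(n):
            x = (1664525 * x + 1013904223) % 2**32
            yield x / 2**32

    def xorshift32(seed, n):
        # xorshift: a completely different update rule.
        x = seed
        for _ in range(n):
            x ^= (x << 13) & 0xFFFFFFFF
            x ^= x >> 17
            x ^= (x << 5) & 0xFFFFFFFF
            yield x / 2**32

    n = 100_000
    a, b = list(lcg(12345, n)), list(xorshift32(12345, n))
    # Both means come out near 0.5 and the histograms look similar; nothing
    # in those coarse statistics points back to which update rule produced them.
    print(sum(a) / n, sum(b) / n)

Fitting either output stream to arbitrary accuracy tells you essentially nothing about whether the machinery behind it was modular arithmetic or bit shifts.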
> I have yet to see an example where a more accurate model was conceptually simpler than the simplest known model at some lower level of accuracy.
I don't understand what you mean. Simple models often yield a high level of understanding without being better predictors: for example, an idealized ball rolling down a plane, Galileo's mass/gravity thought experiment, Kepler's laws, etc. Many of these models ignore less important details to focus on the fundamental ones.
> From an information-theoretic angle I think it's similar to compression (something that ML also happens to be almost unbelievably good at). Related to this, I've seen it argued somewhere (I don't immediately recall where) that learning (in both the ML and human sense) amounts to constructing a world model via compression, and that rings true to me.
In practice you get nowhere trying to recreate the internals of a cryptographic pseudo-random number generator from the output it produces (maybe in theory you could do it with infinite data and no bounds on computational complexity or something) even though the generator itself could be highly compressed.
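A toy sketch of that asymmetry (hypothetical; SHA-256 in counter mode stands in for a cryptographic generator): the generator itself is a few lines of code, yet a generic compressor cannot shrink its output at all, let alone lead you back to its internals.

    import hashlib, zlib

    # Toy sketch: a tiny generator whose output is incompressible to a
    # generic compressor, so "compressing the output" gets you nowhere
    # near reconstructing the mechanism that produced it.
    def stream(seed: bytes, nblocks: int) -> bytes:
        out = bytearray()
        for counter in range(nblocks):
            out += hashlib.sha256(seed + counter.to_bytes(8, "big")).digest()
        return bytes(out)

    data = stream(b"secret", 4096)  # 128 KiB of output
    print(len(data), len(zlib.compress(data)))  # compressed size is no smaller (usually a bit larger)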
> Sure, but what leads to those theories? They are invariably the result of attempting to more accurately model the things which we can observe.
Yes, but if the model does not lead to understanding, you cannot come up with those new ideas.