Comment by uoaei

3 years ago

As hacky as it ends up being in practice, there are some pretty solid theoretical fundamentals to the field of statistical learning.

The problem is the theory is constrained either to the micro-scale (individual layers/"simple" models, etc.) or to the supra-scale (optimization/learning theory, etc.).

Not much concrete can be said about the macro-scale (individual networks) in theoretical terms, only that empirically they seem to tend toward the things the supra-scale theory says they should do.

The current controversy in the academia-vs-engineers tussle comes down to two questions: 1) what exactly do the empirical results imply, and 2) how much does the theory really matter given the practical outcomes? The one thing the two sides broadly agree on is that some amount of error will always exist, because NNs can be broadly understood as lossy compression machines.
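
The lossy-compression point can be made concrete with a toy sketch (my own illustration, not from the comment above): any model with far fewer free parameters than training targets generally cannot reproduce those targets exactly, so some residual error is baked in. A linear least-squares fit shows the same effect in miniature:

```python
import numpy as np

# Toy illustration: a model with fewer parameters than data points
# cannot memorize arbitrary targets exactly -- some error is irreducible.
rng = np.random.default_rng(0)

n_samples, n_params = 100, 5              # far fewer parameters than samples
X = rng.standard_normal((n_samples, n_params))
y = rng.standard_normal(n_samples)        # arbitrary targets, no structure

# Best fit the model is capable of (least-squares solution).
w, *_ = np.linalg.lstsq(X, y, rcond=None)
residual = np.linalg.norm(y - X @ w)

print(f"irreducible training error: {residual:.3f}")  # strictly positive
```

A 5-parameter linear map is a lossy code for 100 arbitrary numbers; the leftover residual is the information the "compressor" had to throw away. Overparameterized NNs complicate this picture on the training set, but the same intuition applies to generalization error.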