Comment by WithinReason
1 year ago
If you've spent some time actually training networks, you know that's not true; that's why batch norm, dropout, and regularization are so successful. They don't increase the network's capacity (parameter count), but they do increase its ability to learn.
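(A minimal sketch of the capacity point, assuming PyTorch; not part of the original comment. Dropout is a regularization layer with no learnable parameters, so adding it leaves the trainable parameter count unchanged even though it typically improves generalization.)

```python
import torch.nn as nn

def count_params(model: nn.Module) -> int:
    # Sum of trainable parameters in the model.
    return sum(p.numel() for p in model.parameters() if p.requires_grad)

plain = nn.Sequential(
    nn.Linear(128, 64),
    nn.ReLU(),
    nn.Linear(64, 10),
)

with_dropout = nn.Sequential(
    nn.Linear(128, 64),
    nn.ReLU(),
    nn.Dropout(p=0.5),  # regularization layer: no learnable parameters
    nn.Linear(64, 10),
)

print(count_params(plain))         # 8906
print(count_params(with_dropout))  # 8906 -- same capacity by parameter count
```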