Comment by sweezyjeezy
15 hours ago
At a very high level, deep learning works because 'it can keep learning from more data' better than any other approach. But without the 'stupid amount of data' that is available now, the architecture would be kind of irrelevant. Unless you go some way toward explaining both sides of the model-data equation, I don't feel you have a solid basis on which to build a scientific theory of, e.g., 'why reasoning models can reason'. The model is the product of both the architecture and the training data.
My fear is that this is as hopeless right now as explaining why humans or other animals can learn certain things from their huge amounts of input data. We'll gain a better empirical understanding, but it won't ever be fundamental computer science again, because the giga-datasets are the fundamental complexity, not the architecture.