Comment by orbital-decay

2 months ago

>Aren't you agreeing with his point? ... Nature is better at compression than ML researchers, by a long shot.

What I mean is basically the opposite. Nature is not better in the sense of being more efficient. It just had a lot more time and scale to do it in an inefficient way. The reason we learn quickly is that we can leverage that accumulated knowledge, in a manner similar to in-context learning or other multi-step learning (the bulk of the training forms abstractions, which are then used by the next stage). It's really unlikely that we have some magical architecture that is fundamentally more sample-efficient than e.g. transformers, or any other architecture, when the underlying data is bad. My intuition is that there might even be a hard limit to that. Multi-stage bootstrapping might be the key, not the architecture.