Comment by nl
4 years ago
It's actually the compression that forces it to learn higher-level concepts.
In your stop sign example, say we are trying to teach a visual model the difference between toy stop signs and real stop signs.
To train it, you feed it a 3D model of the world and the actions a person takes in response (i.e., ignoring toy stop signs but stopping for real ones). Once the embedding is well trained (with lots of data), if you then run it through something like UMAP to reduce the embedding from hundreds of dimensions down to 2 or 3, you'll see it has "discovered" the concept of "scale": all the small toy stop signs will be clustered together, and the real ones will be clustered elsewhere.
That generalisation forced by compression is where the abstraction of "scale" comes from.
(Of course in real life you'd use a more complex model than just an embedding for this, but in principle this is the idea).
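A minimal sketch of that dimensionality-reduction step, with everything made up for illustration: synthetic 64-d "embeddings" where toy and real stop signs differ along one learned direction, and a plain PCA projection standing in for UMAP (the clustering argument is the same in this toy case):

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical 64-d embeddings: assume the trained model has placed
# toy vs real stop signs on opposite sides of a "scale" direction.
scale_dir = rng.normal(size=64)
scale_dir /= np.linalg.norm(scale_dir)

toy = rng.normal(size=(100, 64)) * 0.3 - 2.0 * scale_dir
real = rng.normal(size=(100, 64)) * 0.3 + 2.0 * scale_dir
X = np.vstack([toy, real])

# Reduce from 64 dimensions to 2 (PCA via SVD, as a UMAP stand-in):
# project the centered data onto the top-2 principal components.
Xc = X - X.mean(axis=0)
_, _, Vt = np.linalg.svd(Xc, full_matrices=False)
X2 = Xc @ Vt[:2].T

# The first component recovers the "scale" direction: the two
# groups end up as separate clusters in the 2-d projection.
toy2, real2 = X2[:100], X2[100:]
print("toy cluster mean (PC1): ", toy2[:, 0].mean())
print("real cluster mean (PC1):", real2[:, 0].mean())
```

With the real thing you'd call `umap.UMAP(n_components=2).fit_transform(X)` from the umap-learn package instead of the SVD lines, but the point is the same: the low-dimensional view makes the discovered concept visible as cluster structure.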