Comment by wg0

2 years ago

Jaw dropping... so essentially DNNs also just "compress" the information? Is that the takeaway here?

Why does this conclusion follow?

Of course similar text compresses more efficiently, but NNs don't work with compressed (varying-size) representations; they work with vector representations which happen to be close in similarity space.
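
A quick way to see the "similar text compresses better" point, using nothing beyond Python's standard library (the byte counts in the comments are rough):

```python
import os
import zlib

similar = b"the cat sat on the mat. " * 100   # highly redundant text
noise = os.urandom(len(similar))              # same length, near-maximal entropy

print(len(similar), len(zlib.compress(similar)))  # 2400 -> a few dozen bytes
print(len(noise), len(zlib.compress(noise)))      # 2400 -> roughly 2400 bytes, no gain
```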

  • they work with compressed representations: you take arbitrary information with varying entropy and map it into a fixed-size vector representation. That's a compression.
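
For what it's worth, a toy numpy sketch of the fixed-size part; the hashing embedding here is just a stand-in for whatever a real network learns:

```python
import numpy as np

DIM = 64
rng = np.random.default_rng(0)
table = rng.normal(size=(10_000, DIM))   # toy embedding table (a real NN learns this)

def embed(text):
    # Hash each token to a row and mean-pool: any input length -> one DIM-sized vector.
    ids = [hash(tok) % 10_000 for tok in text.split()]
    return table[ids].mean(axis=0)

print(embed("the cat").shape)                                             # (64,)
print(embed("a much longer sentence with many more words in it").shape)   # still (64,)
```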

Well, yeah, but the training process means that the compression is both lossy and much less efficient than a standard compression method like gzip. You could even train your NN until it can losslessly recall its training data, but we generally call that "overfitting" in the lingo.

  • The way you'd do compression with a NN is to use it to predict the probability of the next symbol, and feed that prediction into an arithmetic coder to produce the compressed representation. This process is lossless, and better prediction quality translates directly into better compression.
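
A rough sketch of that link (the arithmetic coder itself is left out; it would emit within a couple of bits of the ideal code length computed here):

```python
import math
from collections import Counter

text = "abracadabra abracadabra abracadabra"

def bits_needed(predict):
    """Ideal code length: an arithmetic coder gets within ~2 bits of this total."""
    return sum(-math.log2(predict(text[:i], c)) for i, c in enumerate(text))

def uniform(context, symbol):
    # Weak model: every symbol in a 27-letter alphabet is equally likely.
    return 1 / 27

freq = Counter(text)
def unigram(context, symbol):
    # Better model: use symbol frequencies (a stand-in for "predicts better").
    return freq[symbol] / len(text)

print(bits_needed(uniform))   # ~166 bits
print(bits_needed(unigram))   # ~78 bits: better prediction -> smaller output
```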

Yes, the biggest mindfuck is autoencoders: you literally brute-force train a lossy compressor.
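
For concreteness, a toy version of that, assuming PyTorch and with random data standing in for a real dataset:

```python
import torch
from torch import nn

# Squeeze 784-dim inputs through an 8-dim bottleneck and train the decoder to
# reconstruct them; the bottleneck vector *is* the lossy compressed representation.
model = nn.Sequential(
    nn.Linear(784, 128), nn.ReLU(),
    nn.Linear(128, 8),                 # encoder: 784 floats -> 8 floats
    nn.Linear(8, 128), nn.ReLU(),
    nn.Linear(128, 784),               # decoder: 8 floats -> 784 floats
)
opt = torch.optim.Adam(model.parameters(), lr=1e-3)

x = torch.rand(256, 784)               # stand-in data; use real images in practice
for _ in range(200):                   # "brute-force" = just minimize reconstruction error
    loss = nn.functional.mse_loss(model(x), x)
    opt.zero_grad()
    loss.backward()
    opt.step()
```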