
Comment by jamwise

3 hours ago

Reminds me of when I tried to use the Library of Babel as a data compression tool. It led me down a fun rabbit hole and was my first introduction to information theory.

The conclusion was that you need basically the same amount of data to represent the address of your data as the data itself, so it's not really effective at compression, just a fun thought experiment.
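A minimal sketch of why the address is as big as the data, assuming a 29-symbol alphabet (lowercase letters, space, comma, period). The alphabet and the function names here are my own illustration, not how any actual Library of Babel site computes its addresses. Treating a text as a base-29 number gives its position among all texts of that length, and that position needs about length x log2(29) bits, which is the same as just storing the text optimally.

```python
import math

# Assumed alphabet for illustration only; any fixed symbol set gives the same result.
ALPHABET = "abcdefghijklmnopqrstuvwxyz ,."

def text_to_address(text: str) -> int:
    """Position of `text` among all same-length strings, read as a base-29 number."""
    base = len(ALPHABET)
    address = 0
    for ch in text:
        address = address * base + ALPHABET.index(ch)
    return address

def address_to_text(address: int, length: int) -> str:
    """Invert text_to_address; the length must be supplied separately."""
    base = len(ALPHABET)
    chars = []
    for _ in range(length):
        address, digit = divmod(address, base)
        chars.append(ALPHABET[digit])
    return "".join(reversed(chars))

def bits_for_text(length: int) -> int:
    """Bits needed to address any one of base**length possible texts."""
    return math.ceil(length * math.log2(len(ALPHABET)))

message = "the address is as long as the book."
address = text_to_address(message)

# The address round-trips, but it is not smaller than the text it points to.
assert address_to_text(address, len(message)) == message
print("bits in address:", address.bit_length())
print("bits to store the text directly:", bits_for_text(len(message)))
```

The two printed numbers come out essentially equal: every possible text needs its own distinct address, so on average the addresses can't be shorter than the texts.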

The cool part of this in modern times is that LLMs are basically a form of lossy compression that actually achieves the gist of what these tools fail at, although it is lossy and requires a massive substrate. This is related to the idea of AI/LLMs being a form of language compression.

The level of compression is pretty impressive when you think about it. I wrote a comment a while back which is still true (although bytes should be bits, so in that sense it’s still wrong): https://news.ycombinator.com/item?id=39559969

Back-of-the-envelope calculation for storing valid 4-grams (sequences of four words): around 10 billion x 14 bits per word = 17 GB for all 10 billion. There are LLMs 100x smaller which can write coherent prose.
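To make the arithmetic easy to check, here is a quick sketch using only the figures from the comment (10 billion 4-grams, 14 bits per word). The constant names are mine. It computes the product as written, and also the variant where each 4-gram stores all four of its words, which is my reading of the setup rather than something the comment states.

```python
NUM_4GRAMS = 10_000_000_000   # "around 10 billion" valid 4-grams
BITS_PER_WORD = 14            # 14 bits can index a 2**14 = 16,384-word vocabulary
WORDS_PER_NGRAM = 4

def gigabytes(bits: int) -> float:
    """Convert a bit count to decimal gigabytes (1 GB = 1e9 bytes)."""
    return bits / 8 / 1e9

# The product exactly as written in the comment: count x 14 bits.
as_written = gigabytes(NUM_4GRAMS * BITS_PER_WORD)

# Assumption: each stored 4-gram spells out all four words at 14 bits each.
all_four_words = gigabytes(NUM_4GRAMS * WORDS_PER_NGRAM * BITS_PER_WORD)

print(f"as written: {as_written} GB")                 # 17.5 GB
print(f"storing all four words: {all_four_words} GB")  # 70.0 GB
```

Either way the table lands in the tens of gigabytes, so the comparison with much smaller LLMs still holds.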

3Blue1Brown just released a video about this intelligence-compression connection.

https://youtu.be/l6DKRf-fAAM

  • The idea was fresh in my mind because I watched this yesterday. Great video, the illustrations and intuition-building around the compressibility of information were so good! I'm so grateful for 3Blue1Brown.

In some sense, science is the most extreme form of compression - Newtonian mechanics explains an incredible number of phenomena in a few lines of text.