Comment by srush
2 years ago
The claim of the paper is that you can store it losslessly! If you assume you have access to an LLM for free, then text is extremely compressible. An embedding would have plenty of bits to store it.
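A rough way to see the compressibility point: with an arithmetic coder, a text costs about the sum of -log2 p(next symbol | context) bits, so a better predictor means fewer bits. Here is a minimal sketch of that arithmetic, with a toy character-bigram model standing in for the LLM. The model, the function names and the sample text are all made up for illustration and are not from the paper, and the code only computes the ideal code length rather than running an actual coder.

```python
import math
from collections import Counter, defaultdict

ALPHABET = 256  # assumption: treat text as bytes, so "no model" costs 8 bits per symbol

def uniform_bits(text):
    # Cost with no predictive model: every symbol costs log2(256) = 8 bits.
    return len(text) * math.log2(ALPHABET)

def train_bigram(corpus):
    # Toy stand-in for the "free" LLM: counts of next char given previous char.
    counts = defaultdict(Counter)
    for prev, nxt in zip(corpus, corpus[1:]):
        counts[prev][nxt] += 1
    return counts

def model_bits(text, counts):
    # Ideal code length under the model: sum of -log2 p(next | prev).
    # Add-one smoothing keeps unseen symbols at nonzero probability,
    # which is what keeps the scheme lossless for any input.
    bits = math.log2(ALPHABET)  # first symbol has no context
    for prev, nxt in zip(text, text[1:]):
        row = counts.get(prev, Counter())
        p = (row[nxt] + 1) / (sum(row.values()) + ALPHABET)
        bits += -math.log2(p)
    return bits

# Hypothetical data: the "model" has already seen lots of similar text.
model = train_bigram("the cat sat on the mat. " * 200)
message = "the cat sat on the mat."
print("no model:  ", uniform_bits(message), "bits")
print("with model:", round(model_bits(message, model), 1), "bits")
```

The model itself is not counted in the bit total, which is exactly the "LLM for free" assumption.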
... which reminds me of recent research on using lossless compression (plus kNN) for text classification.
https://news.ycombinator.com/item?id=36707193
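For anyone who hasn't seen the idea, a minimal sketch of how compression plus kNN can classify text. I'm assuming gzip as the compressor and normalized compression distance (NCD) as the metric, which is a common recipe but not necessarily what the linked research does in detail. The training snippets, labels and function names are invented for the example.

```python
import gzip
from collections import Counter

def clen(s):
    # Compressed length in bytes; a crude proxy for information content.
    return len(gzip.compress(s.encode("utf-8")))

def ncd(a, b):
    # Normalized compression distance: small when compressing a and b
    # together costs little more than compressing the larger one alone.
    ca, cb = clen(a), clen(b)
    cab = clen(a + " " + b)
    return (cab - min(ca, cb)) / max(ca, cb)

def knn_classify(query, train, k=1):
    # train is a list of (text, label) pairs; majority vote over the
    # k training texts with the smallest NCD to the query.
    nearest = sorted(train, key=lambda pair: ncd(query, pair[0]))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]

# Hypothetical toy training set.
train = [
    ("the team won the match with a late goal in the second half of the game", "sports"),
    ("the striker scored twice and the coach praised the defense after the match", "sports"),
    ("the central bank raised interest rates to slow inflation in the economy", "finance"),
    ("stocks fell as investors worried about rising interest rates and inflation", "finance"),
]

print(knn_classify("the coach said the team played a great match", train, k=1))
```

There is no training step and no parameters beyond k, which is the appeal. On inputs this short gzip's fixed overhead dominates, so treat the toy result as illustrative only.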