
Comment by omoikane

3 days ago

Current leader of the Large Text Compression Benchmark is NNCP (compression using neural networks), also by Fabrice Bellard:

https://bellard.org/nncp/

Also, nncp-2024-06-05.tar.gz is just 1180969 bytes, unlike ts_zip-2024-03-02.tar.gz (159228453 bytes, which is bigger than uncompressed enwik8).

Doesn't this fit the Hutter Prize conditions mentioned in another comment here (https://news.ycombinator.com/item?id=46595109)?

  • It's too slow for that. The Hutter Prize is CPU-only, so neural-network solutions (which are the most interesting, IMO) are effectively excluded. You would need to decompress at about 11,574 characters per second on the CPU alone (see the quick arithmetic below), and the compression time also counts, with the total having to stay below 24 hours.
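
  A minimal sanity check of that throughput figure, assuming the target is enwik9 (10^9 bytes) and that decompression gets the entire 24-hour budget (in practice compression eats into it too, so the real requirement is stricter):

      # Assumption: enwik9 is 10**9 bytes and the whole 24-hour budget
      # goes to decompression; this is a best case, not the actual rule text.
      enwik9_bytes = 10**9
      budget_seconds = 24 * 3600
      required_rate = enwik9_bytes / budget_seconds
      print(f"{required_rate:.0f} bytes/s")  # -> 11574 bytes/s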