Comment by Someone
7 hours ago
> I’ve used LEB128 (with canonicalisation) extensively and... this looks so much nicer for most use-cases (length prefixed, supports the full uint64 range without that extra 10th byte)
If you only want to encode uint64 numbers, LEB128 could easily be tweaked to fit in 9 bytes in several ways:
- using the offset trick described in this article would remove non-unique encodings (0x80 0x00 would encode 128 instead of being a second encoding of 0)
- never allowing encodings longer than 9 bytes would mean the MSB (the continuation bit) of any ninth byte would always be zero, so you could reuse that bit for data and store 8 bits in the ninth byte: 7 bits in each of the first eight bytes plus 8 in the ninth = 64
Both tweaks would lose LEB128’s property that you can find where each number starts from any byte in the stream, but the encoding discussed here doesn’t have that property either.
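To make the two tweaks concrete, here is a rough Python sketch of each, kept separate for clarity. The function names (`encode_u64_9`, `decode_u64_9`, `encode_offset`, `decode_offset`) are made up for illustration, and the offset variant is my reading of the trick rather than the article's exact format; neither does bounds checking on truncated input.

```python
# Tweak 2: cap the length at 9 bytes and use all 8 bits of the ninth byte.
# Note: this sketch does NOT apply the offset trick, so a decoder like this
# still accepts non-canonical forms such as 0x80 0x00 for zero.
def encode_u64_9(n: int) -> bytes:
    if not 0 <= n < 1 << 64:
        raise ValueError("value does not fit in uint64")
    out = bytearray()
    for _ in range(8):
        low = n & 0x7F
        n >>= 7
        if n == 0:
            out.append(low)          # MSB clear: last byte
            return bytes(out)
        out.append(low | 0x80)       # MSB set: more bytes follow
    out.append(n)                    # 56 bits consumed, at most 8 bits remain
    return bytes(out)


def decode_u64_9(data: bytes, pos: int = 0) -> tuple[int, int]:
    """Return (value, position of the next byte)."""
    value = 0
    for i in range(8):
        b = data[pos + i]
        value |= (b & 0x7F) << (7 * i)
        if b < 0x80:
            return value, pos + i + 1
    value |= data[pos + 8] << 56     # ninth byte carries 8 data bits
    return value, pos + 9


# Tweak 1: offset trick. A k-byte encoding starts counting where the
# (k-1)-byte encodings stop, so every value has exactly one encoding.
def encode_offset(n: int) -> bytes:
    if n < 0:
        raise ValueError("value must be non-negative")
    length, base = 1, 0
    while n - base >= 1 << (7 * length):
        base += 1 << (7 * length)
        length += 1
    rest = n - base
    out = bytearray()
    for i in range(length):
        group = (rest >> (7 * i)) & 0x7F
        out.append(group | (0x80 if i < length - 1 else 0))
    return bytes(out)


def decode_offset(data: bytes, pos: int = 0) -> tuple[int, int]:
    value, base, i = 0, 0, 0
    while True:
        b = data[pos + i]
        value |= (b & 0x7F) << (7 * i)
        if b < 0x80:
            return value + base, pos + i + 1
        i += 1
        base += 1 << (7 * i)         # skip values covered by shorter encodings
```

With this sketch, `encode_u64_9(2**64 - 1)` comes out as nine 0xFF bytes, and `decode_offset(bytes([0x80, 0x00]))` gives 128, matching the example in the first bullet.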