
Comment by saalweachter

5 years ago

I wonder what the largest checksummed file is that you could manually (well, programmatically) test every single bit flip against in a reasonable amount of time, rechecksumming after each flip to see if you've found the bad bit.

1 kilobit of data seems pretty doable: only about a thousand variants to check. Somewhere around 1 gigabit seems like it should be too much for consumer hardware; if it takes 1 second to re-sum the data, that's roughly 30 years to test every bit flip.

This is the sad quadratic part of cryptographically strong checksumming: the total work is (number of bits) times (cost of one full re-sum), so the heavier the algorithm, the longer each pass takes, and the harder it is to exhaust even trivially sized inputs.

Sometimes, when you have both copies of the data available, simply loading them into RAM and comparing them directly is faster than any rechecksumming.

xxHash comes to mind as the (apparently) fastest non-cryptographic general-purpose hash out there at the moment; it runs close to memory speed. It would be interesting to see a microbenchmark-style graph of the slowdown when checking every single-bit flip of 1 KB, 1 MB, etc.
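xxHash itself lives in the third-party `xxhash` package, but a stdlib stand-in is enough to sketch that microbenchmark: `zlib.crc32` as the fast non-cryptographic hash, SHA-256 as the heavy one, extrapolating each single-pass time to the full all-bit-flips cost (helper name and sizes are mine):

```python
import hashlib
import time
import zlib

def secs_per_pass(fn, data: bytes, reps: int = 20) -> float:
    """Average wall-clock seconds for one full pass of `fn` over `data`."""
    start = time.perf_counter()
    for _ in range(reps):
        fn(data)
    return (time.perf_counter() - start) / reps

data = bytes(1 << 20)                          # 1 MiB of zeros
heavy = secs_per_pass(lambda d: hashlib.sha256(d).digest(), data)
light = secs_per_pass(zlib.crc32, data)
bits = len(data) * 8

# Exhausting every bit flip costs roughly (bits) x (one re-sum):
print(f"sha256: {heavy * 1e3:.3f} ms/pass -> ~{heavy * bits / 3600:.1f} h for all flips")
print(f"crc32:  {light * 1e3:.3f} ms/pass -> ~{light * bits / 3600:.1f} h for all flips")
```

Scaling the data size up by 10x scales the extrapolated exhaust time by roughly 100x, which is the quadratic wall in graph form.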