← Back to context

Comment by cyberax

2 days ago

Small block ciphers are great for some use-cases!

32-bit block ciphers are a good way to create short opaque IDs because they provide a bijection between two sets of integers. And even if your ID is slightly shorter than 32-bit you can easily shave off a few bits with cycle walking: https://en.wikipedia.org/wiki/Format-preserving_encryption#F... E.g. if you want to make sure your IDs can be mapped into 31/63 bits.

I especially like the RC-5 cipher for these kinds of uses. It can be implemented in just a few lines of code and there are standard test vectors for it.

The RC-5 cipher was very nice for its day, but I am certain that it is much slower than AES on any modern CPU, with the exception of microcontrollers, where nonetheless other solutions, e.g. ChaCha20, may be faster.

AES also needs only a handful of lines of code for its implementation (using assembly). For such an application, you can even reduce the number of rounds of AES-128, e.g. from 10 to 4.

When you want truly uniform random numbers, then encrypting with AES-128, then truncating, is best. If you want invertible encryption, then you should encrypt a counter and either use a 32-bit addition or a 32-bit XOR for encrypting the 32-bit number. With a single AES-128 invocation for generating a random mask, you can encrypt four 32-bit numbers.

Of course, when speed does not matter, you can use pretty much any of the historical block ciphers, because the security requirements for encrypting 32-bit numbers are very low, since they are easier to find by brute force searching than by attempting to break any kind of encryption.

  • The problem is that AES needs a 128-bit block.

    Imagine that you want to obfuscate your order numbers in the database so that customers can't infer the volume of business by checking the order number.

    You can use UUIDs, but you also want to keep the numbers short so they can be dictated over the phone. You can use random IDs, but then you need to lock them in the database during the object creation otherwise you might get collisions.

    Or perhaps you have a distributed system and want to allocate a bunch of IDs for each node in advance.

    RC-5 with its small block size neatly solves this. You can have a strictly increasing database sequence, and then just give out each node a bunch of numbers ("from 100 to 120"), and nodes can then generate IDs based on them. And there can be no collisions with other nodes because symmetric ciphers are obviously bijective.

    For these kinds of use-cases performance is not an issue.

    • Standard AES needs an 128-bit block.

      However, the AES mixing function is made from a 32-bit mixing function that is extended to blocks whose lengths are multiples of 32-bit by composing it with a byte permutation.

      The standard byte permutation extends the block size from 32-bit to 128-bit, but with the current AES instructions of CPUs you can either cancel the byte permutation to get a 32-bit block size or you can replace it with a non-standard byte permutation, to get any block size that is a multiple of 32-bit.

      If you cancel the byte permutation, four 32-bit words are encrypted independently, instead of one 128-bit block. This doubles the number of instructions, but this does not necessarily reduce the throughput, as the instructions may be executed concurrently, so the throughput measured in encrypted numbers will increase by a number between 2 and 4, depending on the CPU.

Funny your example is rc5, I wrote exactly what you describe to generate 32-bit cookies in a random prototype a few years ago: https://github.com/jcalvinowens/sdvr/blob/main/rc5.c

It is cute, but surely there's a more efficient way than RC5? There are bijective hash functions which are much cheaper (murmur, at least).

  • In my case, performance was utterly unimportant.

    But is Murmur actually bijective?

    • Mine too, I was just curious.

      I recall empirically determining murmur was bijective across all 32-bit inputs, but I can't find that written down anywhere.