Comment by philipkglass
3 years ago
Don't you have to pad the output ciphertext size to match the largest article you could possibly request from the set of articles? Or is a fixed-size output an inherent property of homomorphic encryption schemes? Otherwise it seems to reveal something to the server just by the size of the ciphertext (since WP articles vary in size).
The output is always going to be fixed size.
The nicer way to look at homomorphic encryption is like math, but the more literal way is like a bunch of logic gates. There are N output wires. You control the bits on the input wires. Nothing you do will change the number of wires.
Yes, every item needs to be the same size. We batch the articles into 100KB chunks, and we split articles larger than this into multiple chunks.
Wouldn't you then have to send out multiple ciphertexts (for articles >100 KB)? Which would leak something about the size of the article...
You would. It’s important for the client to pace it’s requests in a way that does not reveal too much (for example, the client should not just request both chunks of the article at the same time). The best thing to do would probably be to have a ‘load more’ button at the bottom of a very long article that makes a separate request.
If you think about it, the pacing of user queries could always reveal something (if there’s a long gap between quarries, perhaps it’s a dense mathematics article?). So the best we can hope for is pacing requests either randomly, or for even further protection, perhaps making dummy requests on a fixed cadence.