Comment by bo1024

12 years ago

> What matters is that large cloud systems are fundamentally incapable of protecting data.

I don't believe that's true.

1. Google (and others?) is already aggressively increasing the amount of encryption it does on traffic between its datacenters. So they have been addressing this problem before it was even brought to light.

2. We easily have the encryption abilities to do many more things than we do with secure cloud data; we'd just have to pay more for it. For instance, I can encrypt everything the minute it leaves my laptop, store it in the cloud, and not decrypt until it hits my laptop again. Nobody but me ever gets the secret key (heck, it could be a one-time pad and thus unbreakable). If I trust the cloud computers themselves, then I can store a different secret key on each and use strong public-key encryption to protect all traffic between machines in the cloud, and between the cloud and my machine. Breaking the system requires compromising a machine, and even then you only get the key for that machine.

3. In theory, fully homomorphic encryption could allow the best of both worlds above. I completely encrypt my data on my machine --- nobody else has the key --- then send it into the cloud where cloud companies can do operations for me like searching, sorting, filtering, etc, all without ever decrypting the data or learning what it is. They send me back the results (securely), then I decrypt. Of course, right now this would be massively slow and expensive, but progress is being made.

Naturally all of the above are subject to the "5-dollar wrench" rule or the "secret court/FISA/warrants" rule. You cannot protect your data from the people making a law that says "give up your data". But it is technologically possible and even feasible to secure data from the NSA's snooping. The tradeoff is cost and time.

If you used a one-time pad before storing to the cloud, you'd either have to store the very same pad in the cloud, which makes the encryption pointless, or keep the pad locally, in which case you wouldn't need the cloud: the pad is exactly as large as the encrypted data, one to one.

And homomorphic encryption is still far from being practical.
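To make the one-time-pad point concrete, here is a minimal Python sketch (the function names `otp_encrypt` and `otp_decrypt` are made up for illustration, not from any library). It shows that the pad you must keep locally is byte-for-byte as large as the data you upload:

```python
import secrets

def otp_encrypt(plaintext: bytes) -> tuple[bytes, bytes]:
    """Return (pad, ciphertext). The pad is random and as long as the data."""
    pad = secrets.token_bytes(len(plaintext))
    ciphertext = bytes(p ^ k for p, k in zip(plaintext, pad))
    return pad, ciphertext

def otp_decrypt(pad: bytes, ciphertext: bytes) -> bytes:
    """XOR the ciphertext with the same pad to recover the plaintext."""
    return bytes(c ^ k for c, k in zip(ciphertext, pad))

data = b"files I want to keep in the cloud"
pad, ciphertext = otp_encrypt(data)  # ciphertext goes to the cloud, pad stays local

assert len(pad) == len(data)                  # local storage cost equals the data size
assert otp_decrypt(pad, ciphertext) == data   # round trip works
```

A pad must never be reused, so every new byte stored remotely costs one new byte stored locally, which is why a standard cipher with a short key is the realistic choice.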

  • Sure, one would more realistically use any standard encryption scheme. Agreed on homomorphic encryption as mentioned in my previous comment. But "impractical" is a far cry from "fundamentally impossible".

> In theory, fully homomorphic encryption could allow the best of both worlds above. I completely encrypt my data on my machine --- nobody else has the key --- then send it into the cloud where cloud companies can do operations for me like searching, sorting, filtering, etc, all without ever decrypting the data or learning what it is.

Can you explain this to me? I don't understand how you can search encrypted data.

  • It's mind-boggling, but possible. Here's the Wikipedia link: http://en.wikipedia.org/wiki/Homomorphic_encryption

    The idea is this: I encrypt my data and give it to the cloud. I also encrypt the algorithm I want the cloud to use. In this case, it could be a search algorithm with the search query hardcoded. Right now, it would have to be encoded as a circuit and then encrypted from there into a different circuit.

    The cloud runs my encrypted data through this "transformed" circuit, yielding some encrypted output. The cloud tells me the output. I then decrypt it with my original key.

    It's crazy that this works (a longstanding open problem, solved by Craig Gentry in 2009). The name "homomorphic" comes from functions f, like homomorphisms, in which "order doesn't matter":

        f(data) = f(Decrypt(Encrypt(data))) = Decrypt(f(Encrypt(data))).
    

    Hope that makes some sense.

  • You don't need fully homomorphic encryption to do encrypted search. Look up PEKS (public-key encryption with keyword search) systems; there are tons of papers on it now (http://crypto.stanford.edu/~dabo/abstracts/encsearch.html). It's possible to do encrypted keyword search with trapdoor functions in such a way that the server can't learn anything about what you're searching on, nor what is stored.

    • Are you aware of any vaguely practical systems for a variant of keyword search that just returns whether the keyword was found (e.g. 1 for found, 0 for not found) but with the added requirement that the result must itself be encrypted? I suspect it degenerates to fully homomorphic encryption though.
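For anyone who wants to see the "order doesn't matter" property from this thread in action, here is a small Python sketch using textbook (unpadded) RSA. This is only an illustration under loud assumptions: it is homomorphic for multiplication only, not fully homomorphic; the parameters are toy-sized and insecure; and the helper names `encrypt` and `decrypt` are made up for this example.

```python
# Toy textbook-RSA parameters: tiny and insecure, for illustration only.
P, Q = 61, 53
N = P * Q                            # public modulus
E = 17                               # public exponent
D = pow(E, -1, (P - 1) * (Q - 1))    # private exponent (modular inverse, Python 3.8+)

def encrypt(m: int) -> int:
    """Unpadded RSA encryption: m^E mod N."""
    return pow(m, E, N)

def decrypt(c: int) -> int:
    """Unpadded RSA decryption: c^D mod N."""
    return pow(c, D, N)

a, b = 7, 3

# The "cloud" multiplies two ciphertexts without ever seeing a or b.
product_ciphertext = (encrypt(a) * encrypt(b)) % N

# Decrypting the result gives the product of the plaintexts
# (valid here because a * b is smaller than N).
assert decrypt(product_ciphertext) == a * b
```

A fully homomorphic scheme supports both addition and multiplication on ciphertexts, which is what lets it evaluate arbitrary circuits; the sketch above only covers the multiplicative half.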
