Comment by blintz
3 years ago
Indeed! The encryptions of the client queries are fully semantically secure: under relatively solid lattice-based cryptographic assumptions, a server would need to do more than 2^128 work to recover the plaintext of someone's query. One query for one item in the database is indistinguishable (without the client's key) from another query for the same item later; in other words, it's similar to the guarantee of CBC or GCM modes, which, as long as you use them correctly, remain secure even if the attacker can see many encryptions of plaintexts of its choosing.
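To make the "indistinguishable without the client's key" point concrete, here is a toy Regev-style LWE encryption sketch (parameters and helper names are illustrative only, not the actual scheme or its parameters): encrypting the same plaintext twice with fresh randomness yields unrelated-looking ciphertexts, yet both decrypt correctly with the secret key.

```python
import secrets

# Toy Regev-style LWE encryption. Parameters are illustrative only;
# real schemes choose n, q, and the error distribution carefully.
n, q = 64, 2**16          # lattice dimension and modulus (toy values)
delta = q // 2            # scaling factor for a single message bit

def keygen():
    return [secrets.randbelow(q) for _ in range(n)]

def encrypt(s, bit):
    a = [secrets.randbelow(q) for _ in range(n)]
    e = secrets.randbelow(7) - 3            # small noise in [-3, 3]
    b = (sum(ai * si for ai, si in zip(a, s)) + e + delta * bit) % q
    return a, b

def decrypt(s, ct):
    a, b = ct
    noisy = (b - sum(ai * si for ai, si in zip(a, s))) % q
    return 1 if q // 4 < noisy < 3 * q // 4 else 0

s = keygen()
c1 = encrypt(s, 1)
c2 = encrypt(s, 1)        # same plaintext, fresh randomness
assert c1 != c2           # ciphertexts look unrelated without s
assert decrypt(s, c1) == decrypt(s, c2) == 1
```

Because the `a` vector is fresh per encryption, two queries for the same item share no visible structure; only someone holding `s` can strip the mask and recover the bit.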
> without the client’s key
Thank you, these 4 words really helped with my understanding, so I'm calling them out in case it helps others. I was wondering: what prevents you from replaying the query and getting the same page back? But it seems the answer is that a replay would only produce a gibberish response, because you don't have the key to decrypt it.
How indistinguishable is a Not Found result produced by the server? It seems like user behavior would be to re-request the article a second time, so the client should probably protect the user against this kind of server (which could bisect on article popularity to find the requested article in ~20 tries) by throwing up a warning about an article in the index not being retrievable.
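The "~20 tries" figure follows from binary search over the index. A minimal simulation of the attack described above (function names are hypothetical; `client_reacted` stands in for any observable retry behavior):

```python
import math

# A malicious server withholds half the remaining candidate articles
# each round and watches whether the client visibly reacts (e.g. by
# re-requesting). That halves the candidate set per round.
def rounds_to_identify(num_articles):
    return math.ceil(math.log2(num_articles))

def bisect_target(target, num_articles):
    candidates = list(range(num_articles))
    rounds = 0
    while len(candidates) > 1:
        half = candidates[: len(candidates) // 2]
        client_reacted = target in half      # server withheld `half`
        candidates = half if client_reacted else candidates[len(half):]
        rounds += 1
    return candidates[0], rounds

article, rounds = bisect_target(target=123_456, num_articles=1_000_000)
assert article == 123_456 and rounds <= 20   # log2(1e6) ≈ 20
```

For a million-article index, at most 20 rounds pinpoint the article, which is why the client-side warning matters.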
Indeed, client behavior when an article fails to be retrieved is very important. A malicious server could guess what article is being retrieved, remove it from the database, and see if the client 'reacts' in some observable way.
The mitigations for this are somewhat heuristic; the client should perhaps 'pace'/'debounce' subsequent requests when a retrieval fails. For example, enforce that the re-retrieval will happen at a uniformly random time between 1-5 seconds later.
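One way the pacing heuristic above could look in client code (a sketch only; `fetch` is a hypothetical stand-in for the PIR query function, and the 1-5 s window comes straight from the comment):

```python
import random
import time

# On a failed retrieval, wait a uniformly random 1-5 seconds before
# retrying, so the retry's timing leaks less about whether the client
# "reacted" to a withheld article.
def retrieve_with_pacing(fetch, article_id, max_attempts=3):
    for attempt in range(max_attempts):
        result = fetch(article_id)
        if result is not None:
            return result
        time.sleep(random.uniform(1.0, 5.0))   # randomized back-off
    return None
```

This is only a mitigation, not a fix: it blurs the timing signal but cannot hide the retry itself from a server that controls availability.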
It's good to note that this kind of attack is somewhat unavoidable, since denial-of-service is (generally) always possible. An interesting mitigation might be for the server to produce a ZKP that the full dot product was computed correctly, and have the client always check the ZKP before displaying the article. This is mostly a theoretical fix for now, I believe, since ZKPs over homomorphic computation are still more theory than practice.
If the protocol is that the server returns a specific "index N deleted" result for deleted N then the client can at least trust a valid response from the server as opposed to a DDoS or unmasking attempt. Any server that returns something other than a valid document or "N deleted" should no longer be trusted or communicated with, but retries for communication failures or timeouts should still be safe.
Edit: this assumes that the client gets a trusted index from a set of trusted servers who are at least as up to date as the latest index that is made available, which timestamped signatures or similar should suffice for. Or the client can supply the timestamped index signature and the server can reply with a different message if it's too old.
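The trust policy proposed above could be sketched as a small client-side decision function (entirely hypothetical structure; the response shapes here are made up for illustration, not part of the real protocol):

```python
from enum import Enum, auto

# Policy from the comment: a valid document or an explicit
# "index N deleted" response is trusted; any other well-formed answer
# marks the server as untrusted; transport failures allow a retry.
class Verdict(Enum):
    ACCEPT = auto()
    RETRY = auto()
    DISTRUST = auto()

def judge(response):
    if response is None:                 # timeout / connection failure
        return Verdict.RETRY
    kind = response.get("kind")
    if kind == "document" and response.get("body"):
        return Verdict.ACCEPT
    if kind == "deleted":                # explicit "index N deleted"
        return Verdict.ACCEPT
    return Verdict.DISTRUST              # anything else: stop talking

assert judge(None) is Verdict.RETRY
assert judge({"kind": "deleted", "index": 7}) is Verdict.ACCEPT
assert judge({"kind": "garbage"}) is Verdict.DISTRUST
```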
I think their model doesn't allow you to express such a thing? At the implementation level, the client requests an id in a fixed range and gets back a 100KB chunk corresponding to that id; if there's no article for that id then it'll just get a chunk of all 0s, and to anyone without the client's key that's indistinguishable from a chunk with content in it.
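A plaintext model of that behavior (the names `chunk_for` and `answer` and the XOR combination are an illustrative stand-in for the homomorphic dot product; in the real protocol the selection vector is encrypted, so the server never sees which slot was selected, and the zero chunk comes back encrypted too):

```python
CHUNK = 100 * 1024                      # 100 KB per record

db = {3: b"article three..."}           # sparse database of ids

def chunk_for(i):
    data = db.get(i, b"")
    return data.ljust(CHUNK, b"\0")     # empty slot -> all zeros

def answer(selection, num_ids):
    # Combine selected chunks; with a one-hot selection this is exactly
    # the requested chunk, or all zeros if that slot is empty.
    out = bytearray(CHUNK)
    for i in range(num_ids):
        if selection[i]:
            c = chunk_for(i)
            for j in range(CHUNK):
                out[j] ^= c[j]
    return bytes(out)

sel = [0] * 8; sel[5] = 1               # request id 5 (absent)
assert answer(sel, 8) == b"\0" * CHUNK  # all-zero chunk for a missing id
```

The client, holding the key, can tell zeros from content; anyone else sees only a ciphertext either way.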
Would this be vulnerable to a side-channel attack as follows?
1. Record what item was retrieved from disk for a query
2. Run a dictionary through the query system, and see which item matches the record
The server processing has to scan the entire database to answer a query, so the entire database stays in memory, and access patterns tell you nothing about the query.
A note for anyone who might misinterpret the "stays in memory" aspect: even the copy in memory is fully scanned by every query. So it's not just disk access patterns that give nothing away, but RAM access patterns as well.
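The reason the scan is total can be seen in a stripped-down sketch of the server's answer computation (encryption omitted for clarity; in the real scheme the query vector is encrypted and the arithmetic is homomorphic): the answer is a dot product over every row, so the memory access pattern is identical regardless of which row the query selects.

```python
rows = [i * 11 for i in range(1000)]     # toy database of integers

def pir_answer(query):
    # sum_i q_i * row_i: touches every row on every query, so access
    # patterns (disk or RAM) are independent of the selected index.
    acc = 0
    for q_i, row in zip(query, rows):
        acc += q_i * row
    return acc

one_hot = [0] * 1000; one_hot[42] = 1
assert pir_answer(one_hot) == rows[42]
```

With the query encrypted, the server runs this same loop over ciphertexts and never learns which coefficient was the 1.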