Comment by yjftsjthsd-h
9 hours ago
Dumb question... Could you do a per-word bloom filter to do online spell checking without actually disclosing the words you're checking?
9 hours ago
Dumb question... Could you do a per-word bloom filter to do online spell checking without actually disclosing the words you're checking?
There are two scenarios I believe, first accidentally sending a (decent) password, and second the server not learning what you actually look up.
For the first case, sending a hash would prevent the server from learning a password that is not in the dictionary, something like password5 would hash to gibberish.
For the second, the server needs to know what to actually send back. I believe Google's malicious website check works (or used to) by truncating a hash an then just sending the answer for some 128 or so websites and have the browser figure out which of them the user wanted to visit. That creates some deniability over witch website you actually visited and should be also usable to prevent the server from learnering what you actually looked up.
So yes, I think you could design a more secure Protokoll. Though general security disclaimer the people trying to read your letters probably spend more time attacking than I spend writing this post.
a bloom filter look up is by hash, and given the relatively small set of words in english, it would be pretty easy for the server to reverse the hash sent to it. Thus a bloom filter wouldn't be very private.
Additionally, a typical spell checker feature is to provide alternative, correct, spellings, rather than just telling you whether a word is correctly spelled.
I bet there's some cool way to do this with zero-knowledge or homomorphic cryptography though!
There’s also a way simpler way: send a hash prefix to server, get a list of matches. Google Safe Browsing does this with URLs, for example.
haveibeenpwned.com does this for passwords too. I doubt you could make it work for all the smaller words though, let alone offer corrections.
You should be able to do a K-means type thing. Where your query is an entire group, and you grab the field from the chunk locally.
But you might still be able to use some frequency sampling to predict the words used, unless those chunks are very very carefully constructed.
> a bloom filter look up is by hash, and given the relatively small set of words in english, it would be pretty easy for the server to reverse the hash sent to it. Thus a bloom filter wouldn't be very private.
The typical use of a Bloom filter is to have it locally as a prefilter, not to send hashes to the server.
> I bet there's some cool way to do this with zero-knowledge or homomorphic cryptography though!
The code for which would almost certainly be larger than a fully local dictionary for any human language.
> a typical spell checker feature is to provide alternative, correct, spellings, rather than just telling you whether a word is correctly spelled.
I personally don't use that one, for me the red underline is enough.
Just want to mention that the feature in question here is for translation, not spell checking.