← Back to context

Comment by lloeki

1 month ago

> they really could have explained that better. The language used in Apple's article

This one?

https://machinelearning.apple.com/research/homomorphic-encry...

I find that description perfectly clear for someone who doesn't already know what homomorphic encryption means:

> One of the key technologies we use to do this is homomorphic encryption (HE), a form of cryptography that enables computation on encrypted data (see Figure 1). HE is designed so that a client device encrypts a query before sending it to a server, and the server operates on the encrypted query and generates an encrypted response, which the client then decrypts. The server does not decrypt the original request or even have access to the decryption key, so HE is designed to keep the client query private throughout the process.

Later:

> HE excels in settings where a client needs to look up information on a server while keeping the lookup computation encrypted.

And there's more perfectly layman-understandable in PIR and PNSS sections, complete with real-life examples and simple diagrams.

One just has to read the goddamn thing, which apparently is an insurmountably tall order these days for content that is more than 250 characters.

I read that. It doesn't actually explain how the server can tell me the Eiffel Tower is in my photo without knowing it's telling me that. It glosses over the mechanism by which it can tell me what's in my photo without the service itself knowing. Yeah, cool, never-decrypted. So how do they match up? An ultra-abstract "A[encrypted] + B[encrypted] = A+B[encrypted]" doesn't tell me anything.

As an aside, your unfounded derision about "RTFA" is unwarranted and unnecessary. That was an unusually-hostile response for like, just talking about things.

  • The server doesn't tell you (and doesn't know) that it's the eiffel tower. Only when the result comes back to your device, is your device able to identify it as such.

  • I'll take it as if you don't know what homomorphic encryption means.

    caveat: this is not rigorous math, just a rough presentation for one to wrap their head around the keystone principles of it

    Say you have a space like natural numbers N and a computation function f: N->R. I'm just using R - as in result - so that it's not confusing below:

    Take a n in N, f(n) = r, r in R

    That's nice, we compute something, but the "computation place" is going to know both n and r.

    Now imagine you have an encryption function g that maps every n in N to another integer. I'll just mark encrypted things with ' e.g the encrypted space is N' for easier trackability, so g: N->N' is encryption and its inverse g': N'->N is decryption.

    So take a n in N, g(n) = n' in N', and g'(n') = n back again.

        N            N'
        
        n --- g ---> n'
        
        n <-- g' --- n'
    

    That's basic encryption.

    Now back to making f useful: the idea is to have an encryption scheme that allows performing a certain set of operations on the encrypted space and get an encrypted result that results in an identical result once decrypted. From these operations we can have a f' function so that:

    for all n in N, f(n) = g'(f'(g(n)))

    Visually:

        N           N'            R'            R
        n --- f --> n' --- g' --> r' --- f' --> r
    

    The whole point is that cheap f and f' happen locally and intensive g' can happen in untrusted places, because all that is seen is an encrypted input n' and encrypted result r'. Only thing it can see is at best that there's A match but not what THE match is, because that all just looks like random numbers; even the no-match case could very well be encrypted and in that case it doesn't even see that.

    That is the principle of "homomorphic encryption", and it carries the intrinsic property of the computation place not knowing a single thing about neither the input or the output.

    > It doesn't actually explain how the server can tell me the Eiffel Tower is in my photo without knowing it's telling me that.

    So it depends on what you mean by "explain how":

    - "homomorphic encryption" is the high level answer, described and provided in the Apple article

    - if that is not a sufficiently detailed answer then you gotta roll your sleeves up and dive much deeper: the swift-homomorphic-encryption implementation, the BFV paper, the Wally paper, and the other bits that help prevent accidental side channel leaking are all both superficially discussed in a synthetic way and deep-linked in the Apple article in "would you like to know more" fashion.

    Or maybe you wanted to know how they performed the actual match (g') in the encrypted space?

    > With PNNS, the client encrypts a vector embedding and sends the resulting ciphertext as a query to the server. The server performs HE computation to conduct a nearest neighbor search and sends the resulting encrypted values back to the requesting device, which decrypts to learn the nearest neighbor to its query embedding

    Which is possible because they

    > have implemented the Brakerski-Fan-Vercauteren (BFV) HE scheme, which supports homomorphic operations that are well suited for computation (such as dot products or cosine similarity) on embedding vectors that are common to ML workflows.

    My apologies for the harsh words, they were not meant as an ad hominem, more like a general ranty reflection on the state of the web society.