Comment by shawnz

2 days ago

I don't agree that this is end-to-end encrypted. For example, a compromise of the TEE would mean your data is exposed. In a truly end-to-end encrypted system, I wouldn't expect a server-side compromise to be able to expose my data.

This is similar to the weaselly language Google is now using with the Magic Cue feature ever since Android 16 QPR 1. When it launched, it was local-only -- now it's local and in the cloud "with attestation". I don't like this trend, and I don't think I'll be using such products.

I agree it is more like e2teee, but I think there is really no alternative beyond TEE + anonymization. Privacy people want it done locally, but that is 5 to 10 years away (or never: if the current economics keeps working, there is no need to reverse the trend).

  • > ... 5 to 10 years away (or never, if the current economics works...

    I think that in 5 to 10 years, PCs that can run SoTA multi-modal LLMs (cf. the Mac Pro) will cost about as much as cars do, and I reckon folks will buy them.

    • ISTM that most people would rather give away their privacy than pay even a single cent for most things.

If (big if) you trust the execution environment, which is apparently auditable; and if (big if) you trust that the TEE Merkle hash used to sign the response is computed based on the TEE as claimed (and not by a malicious actor spoofing a TEE that lives within an evil environment); and also if you trust the inference engine (vLLM / SGLang, what have you), then I guess you can be confident the system is private.
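
As a very rough sketch of what that "trust the signature" step could look like from the client side -- the field names, wire format, and the idea that the service returns (response, signature, measurement, TEE public key) are all assumptions for illustration, not any real provider's protocol:

```python
# Hypothetical client-side checks; everything about the shape of the data here
# is an assumption made for illustration, not a real provider's API.
import hashlib
from cryptography.exceptions import InvalidSignature
from cryptography.hazmat.primitives.asymmetric import ed25519

# Hashes you (or auditors you trust) reproduced from the published TEE image.
KNOWN_GOOD_MEASUREMENTS = {bytes.fromhex("00" * 32)}  # placeholder value

def response_is_trustworthy(response: bytes, signature: bytes,
                            measurement: bytes, tee_pubkey_bytes: bytes) -> bool:
    # 1. The reported measurement must match a build you can independently audit.
    if measurement not in KNOWN_GOOD_MEASUREMENTS:
        return False
    # 2. The signature over (measurement || response) must verify under the key
    #    that the hardware vendor's attestation chain endorses. Verifying that
    #    chain, and that the key really lives inside genuine hardware, is elided
    #    here -- and that is exactly where the "big ifs" live.
    tee_key = ed25519.Ed25519PublicKey.from_public_bytes(tee_pubkey_bytes)
    try:
        tee_key.verify(signature, hashlib.sha256(measurement + response).digest())
    except InvalidSignature:
        return False
    return True
```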

Lots of ifs there, though. That said, I do trust Moxie in terms of execution; he doesn't seem like the type of person to take half measures.

Agree. Products and services in the privacy space have a tendency to be incredibly misleading in their phrasing, framing, and overall marketing, making assertions that sound pretty much like: "we totally can never ever see your messages, completely and utterly impossible". Proton is particularly bad for this; it's rather unfortunate to see it from "Moxie" as well.

It's like, come on, you know exactly what you're doing; it's unambiguous how people will interpret this, so just stop it. Cue everyone arguing over the minutiae while hardly anyone points out how troubling it is that these people/entities have no qualms about being so misleading/dishonest...

"Server-side" is a bit of a misnomer here.

Sure, for e.g. E2E email, the expectation is that all the computation occurs on the client, and the server is a dumb store of opaque encrypted stuff.

In a traditional E2E chat app, on the other hand, you've still got a backend service acting as a dumb pipe that shouldn't have the keys to decrypt traffic flowing through it; but you've also got multiple clients — not just your own that share your keybag, but the clients of other users you're communicating with. "E2E" in the context of a chat app means "messages are encrypted within your client; messages can then only be decrypted within the destination client(s) [i.e. the client(s) of the user(s) in the message thread with you.]"

"E2E AI chat" would be E2E chat, with an LLM. The LLM is the other user in the chat thread with you; and this other user has its own distinct set of devices that it must interact through (because those devices are within the security boundary of its inference infrastructure.) So messages must decrypt on the LLM's side for it to read and reply to, just as they must decrypt on another human user's side for them to read and reply to. The LLM isn't the backend here; the chat servers acting as a "pipe" are the backend, while the LLM is on the same level of the network diagram as the user is.

Let's consider the trivial version of an "E2E AI chat" design, where you physically control and possess the inference infrastructure. The LLM infra is e.g. your home workstation with some beefy GPUs in it. In this version, you can just run Signal on the same workstation, and connect it to the locally-running inference model as an MCP server. Then all your other devices gain the ability to "E2E AI chat" with the agent that resides in your workstation.
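
The inference side of that trivial setup might look something like the sketch below, assuming the Python MCP SDK's FastMCP helper and an OpenAI-compatible inference server (llama.cpp, vLLM, etc.) on localhost -- both of which are assumptions here, and the chat-client integration is hand-waved:

```python
# Expose a locally-running model as an MCP tool so a chat client on the same
# box can call it. Nothing in this path leaves the workstation.
import httpx
from mcp.server.fastmcp import FastMCP

mcp = FastMCP("home-agent")

@mcp.tool()
def ask_local_model(prompt: str) -> str:
    """Send a prompt to the local inference server and return its reply."""
    r = httpx.post(
        "http://localhost:8080/v1/chat/completions",  # local-only endpoint
        json={"model": "local", "messages": [{"role": "user", "content": prompt}]},
        timeout=120,
    )
    r.raise_for_status()
    return r.json()["choices"][0]["message"]["content"]

if __name__ == "__main__":
    mcp.run()  # stdio transport by default, so there's no network listener at all
```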

The design question, being addressed by Moxie here, is what happens in the non-trivial case, when you aren't in physical possession of any inference infrastructure.

Which is obviously the applicable case to solve for most people, 100% of the time, since most people don't own and won't ever own fancy GPU workstations.

But, perhaps more interesting for us tech-heads that do consider buying such hardware, and would like to solve problems by designing architectures that make use of it... the same design question still pertains, at least somewhat, even when you do "own" the infra; just as long as you aren't in 100% continuous physical possession of it.

You would still want attestation (and whatever else is required here) even for an agent installed on your home workstation, so long as you're planning to ever communicate with it through your little chat gateway when you're not at home. (Which, I mean... why else would you bother with setting up an "E2E AI chat" in the first place, if not to be able to do that?)

Consider: your local flavor of state spooks could wait for you to leave your house; slip in and install a rootkit that directly reads from the inference backend's memory; and then disappear into the night before you get home. And, no matter how highly you presume your abilities to detect that your home has been intruded into / your computer has been modified / etc once you have physical access to those things again... you'd still want to be able to detect a compromise of your machine even before you get home, so that you'll know to avoid speaking to your agent (and thereby the nearby wiretap van) until then.

Just like your mobile device is one end of the end-to-end encryption, the TEE is the other end. If properly implemented, the TEE would measure all software and ensure that there are no side channels through which the sensitive data could be read.
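
"Measuring all software" roughly means folding each loaded component into a running hash before it executes, so the final value commits to everything that ran. Real TEEs (TPM PCRs, SEV-SNP launch digests, etc.) do this in hardware/firmware; the stage names below are placeholders, but the idea fits in a few lines:

```python
# PCR-style "extend": new = H(old || H(component)), applied per stage.
import hashlib

def extend(measurement: bytes, component: bytes) -> bytes:
    return hashlib.sha256(measurement + hashlib.sha256(component).digest()).digest()

measurement = b"\x00" * 32  # well-known initial value
for stage in (b"firmware-image", b"kernel-image", b"inference-server-binary"):
    measurement = extend(measurement, stage)

print(measurement.hex())  # this is what gets attested; swap any component and it changes
```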

  • By that logic SSL/TLS is also end-to-end encryption, except it isn't

    • When the server is the final recipient of a message sent over TLS, then yes, that is end-to-end encryption (for instance, if a load balancer is not decrypting traffic in the middle). If the message's final recipient is a third party, then you are correct that an additional layer of encryption would be necessary. The TEE is the execution environment that needs access to the decrypted data to process the AI operations; therefore, it is one end of the end-to-end encryption.
