Comment by scosman

4 months ago

"I don't understand most of the technical details of Apple's blog post"

I do:

- Client-side vectorization: the photo is processed locally to produce a non-reversible vector representation before anything is sent (think semantic hash).

- Differential privacy: a decent amount of noise is added to the vector before it is sent, enough to make it impossible to reverse-lookup the original from the vector. The noise level here is ε = 0.8, which is quite good privacy. (A rough sketch of these first two steps follows the list.)

- OHTTP relay: it's sent through a 3rd party so Apple never knows your IP address. The contents are encrypted so the 3rd party doesn't learn anything either (some risk of exposing "IP X is an Apple Photos user", but nothing about the content of the library).

- Homomorphic encryption: the lookup is performed server-side on encrypted data. Apple can't decrypt the vector contents or the response contents. Only the client can decrypt the result of the lookup.
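
Here's a minimal sketch of what the first two bullets amount to, in made-up Python (the embedding values and the sensitivity bound are assumptions for illustration, not Apple's actual model or parameters):

    import random

    EPSILON = 0.8
    SENSITIVITY = 1.0  # assumed sensitivity of the embedding, for illustration

    def laplace(scale):
        # a Laplace sample is the difference of two i.i.d. exponential samples
        return random.expovariate(1 / scale) - random.expovariate(1 / scale)

    def privatize(embedding, epsilon=EPSILON):
        # add noise calibrated to epsilon before anything leaves the device
        scale = SENSITIVITY / epsilon
        return [x + laplace(scale) for x in embedding]

    # stand-in for the on-device vision model's output for one photo
    embedding = [0.12, -0.40, 0.33, 0.08]
    noisy_vector = privatize(embedding)  # this, not the photo, is what gets sent

The photo itself never leaves the device; only this noised vector does, and then only after the relay and encryption steps described above.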

This is what a good privacy story looks like: multiple layers of protection, where any one of the latter three alone should be enough to protect privacy.

"It ought to be up to the individual user to decide their own tolerance for the risk of privacy violations." -> The author themselves looks to be an Apple security researcher, and are saying they can't make an informed choice here.

I'm not sure what the right call is here. But the conclusion "Thus, the only way to guarantee computing privacy is to not send data off the device." isn't true. There are other tools to provide privacy (DP, homomorphic encryption), while also using services. They are immensely complicated, and users can't realistically evaluate risk. But if you want features that require larger-than-disk datasets, or frequently changing content, you need tools like this.

I appreciate the explanation. However, I think you do not address the main problem, which is that my data is being sent off my device by default and without any (reasonable) notice. Many users may agree to such a feature (as you say, it may be very secure), but to assume that everyone ought to be opted in by default is the issue.

  • I'm not sure I agree -- asking users about every single minor feature is (a) incredibly annoying, and (b) quickly causes request-blindness in even reasonably security-conscious users. So restraining the nagging for only risky or particularly invasive things makes sense to me.

    Maybe they should lump its default state into something that already exists? E.g. assume that if you already have location access enabled for Photos (it does ask!), you've already indicated that you're okay with something identifying being sent to Apple whenever you take a picture.

    My understanding is that Location Services will, among other things, send a hash of local WiFi network SSIDs and signal strengths to a database Apple maintains, and use that to triangulate a possible position for you. This seems loosely analogous to what's going on here with the compute-a-vector thing.

    • > Maybe they should lump its default state into something that already exists?

      It could be tied to iCloud Photos, perhaps, because then you already know that your photos are getting uploaded to Apple.

      10 replies →

    • > asking users about every single minor feature

      Then perhaps the system is of poor design and needs further work before being unleashed on users…

    • Especially for a company which heavily markets how privacy-focused it is:

      1) Sending my personal data to them in any way is not a "feature." It's especially not a feature because what it sets out to do is largely unnecessary: every photo already has geotagging, time-based grouping, and AI/ML/whatever on-device keyword assignments and OCR. I can open up my phone right now and search for every picture that has grass in it. I can search for "washington" and if I took a picture of a statue of George Washington that shows the plaque, my iPhone already OCR'd that and will show the photo.

      2)"minor" is not how I would ever describe sending data based off my photos to them, regardless of how much it's been stuffed through a mathematical meat grinder.

      3) Apple is usually very upfront about this sort of thing, and also loves to mention the most minor, insignificant, who-gives-a-fuck feature addition in the change notes for "point" system updates. We're talking things like "Numbers now supports setting font size in chart legends" (I'm making that up, but you get the point.)

      This was very clearly an "ask for forgiveness because the data we want is absolutely priceless and we'll get lots of it by the time people notice / word gets out." It's along the lines of Niantic using the massive trove of photos from the Pokémon games to create 3D maps of everywhere.

      I specifically use iOS because I value my privacy (and don't want my cell phone data plan, battery power, etc to be a data collection device for Google.) Sending data based off my photos is a hard, do-not-pass-go-fuck-off-and-die line in the sand for me.

      It's especially shitty because they've gated a huge amount of their AI shit behind owning the current iPhone model....but apparently my several generation old iPhone is more than good enough to do some AI analysis on all my photos, to upload data for them?

      Fuck everyone at Apple who was involved in this.

      3 replies →

  • I think it does address the main problem. What he is saying is that multiple layers of security are used to ensure (mathematically and theoretically proven) that there is no risk in sending the data, because it is encrypted and sent in such a way that Apple or any third party will never be able to read/access it (again, based on theoretically provable math). If there is no risk there is no harm, and then there is a different calculus for 'by default', opt in/out, notifications etc.

    The problem with this feature is that we cannot verify that Apple's implementation of the math is correct and without security flaws. Everyone knows there are security flaws in all software, and this implementation is not open (i.e. we cannot review the code, and even if we could review the code we cannot verify that the provided code was the code used in the iOS build). So, we have to trust that Apple did not make any mistakes in their implementation.

    • Your second paragraph is exactly the point made in the article as the reason why it should be an informed choice and not something on by default.

      3 replies →

    • As someone with a background in mathematics I appreciate your point about cryptography. That said, there is no guarantee that any particular implementation of a secure theoretical algorithm is actually secure.

      20 replies →

    • Hypothetical scenario: Theo de Raadt and Bruce Schneier are hired to bring Apple products up to their security standards. They are given a public blog, and they are not required to sign an NDA. They fix every last vulnerability in the architecture. Vladimir Putin can buy MacBooks for himself and his generals in Moscow, enable Advanced Data Protection, and collaborate on war plans in total confidence.

      Where are the boundaries in this scenario?

      4 replies →

    • Except for the fact (?) that quantum computers will break this encryption, so if you wanted to you could hoard the data and just wait a few years and then decrypt?

      3 replies →

  • I’m a cryptographer and I just learned about this feature today while I’m on a holiday vacation with my family. I would have loved the chance to read about the architecture, think hard about how much leakage there is in this scheme, but I only learned about it in time to see that it had already been activated on my device. Coincidentally on a vacation where I’ve just taken about 400 photos of recognizable locations.

    This is not how you launch a privacy-preserving product if your intentions are good, this is how you slip something under the radar while everyone is distracted.

    • In engineering we distinguish the "how" of verification from the "why" of validation; it looks like much of the disagreement in the comments on this post is about the premise of whether ANY outgoing data counts as a privacy consent issue. It's not a technical issue, it's a premises disagreement, and that can be hard to explain to the other side.

      2 replies →

    • To play Apple's advocate, this system will probably never be perfect, and stand up to full scrutiny from everyone on the planet. And they also need the most people possible activated, as it's an adversarial feature.

      The choice probably looks to them like:

        A - play the game, give everyone a heads up, respond to all feedback, and never ship the feature
      
       B - YOLO it, weather the storm, have people forget about it after the holiday, and go on with their life.
      

      Whether B works is up for debate, but that was probably their only chance to have it ship, from their POV.

      4 replies →

  • I think I'm saying: you're not sending "your data" off device. You are sending a homomorphically encrypted locally differentially private vector (through an anonymous proxy). No consumer can really understand what that means, what the risks are, and how it would compare to the risk of sending someone like Facebook/Google raw data.

    I'm asking: what does an opt-in for that really look like? You're not going to be able to give the user enough info to make an educated decision. There's a ton of risk of "privacy washing" ("we use DP" but at a very poor epsilon, or "we use E2E encryption" with side-channel data gathering).

    There's no easy answer. "Ask the user", when the question requires a PhD-level understanding of stats to evaluate the risk, isn't a great answer. But I don't have another one.

    • In response to your second question, opt-in would look exactly like this: don't have the box checked by default, with an option to enable it: "use this to improve local search, we will create an encrypted index of your data to send securely to our servers, etc..." A PhD is not necessary to understand the distinction between storing data locally on a machine vs. on the internet.

      28 replies →

    • There is significant middle ground between "do it without asking" and "ask about every single thing". A reasonable option would be "ask if the device can send anonymized data to Apple to enable such and such features". This setting can apply to this specific case, as well as other similar cases for other apps.

    • If you can't meaningfully explain what you're doing then you can't obtain informed consent. If you can't obtain informed consent then that's not a sign to go ahead anyway, it's a sign that you shouldn't do it.

      This isn't rocket surgery.

      2 replies →

    • I don't care if all they collect is the bottom right pixel of the image and blur it up before sending it, the sending part is the problem. I don't want anything sent from MY device without my consent, whether it's plaintext or quantum proof.

      You're presenting it as if you have to explain elliptic curve cryptography in order to toggle a "show password" dialogue but that's disingenuous framing, all you have to say is "Allow Apple to process your images", simple as that. Otherwise you can argue many things can't possibly be made into options. Should location data always be sent, because satellites are complicated and hard to explain? Should we let them choose whether they can turn wifi on or off, because you have to explain IEEE 802.11 to them?

      18 replies →

  • Notice is always good and Apple should implement notice.

    However, "my data is being sent off my device" is incorrect, as GP explained. Metadata, derived from your data, with noise added to make it irreversible, is being sent off your device. It's the equivalent of sending an MD5 of your password somewhere; you may still object, but it is not factually correct to say your password was transmitted.

    • > However, "my data is being sent off my device" is incorrect, as GP explained. Metadata, derived from your data, with noise added to make it irreversible, is being sent off your device.

      Sounds like my data is being sent off my device.

      > It's the equivalent of sending an MD5 of your password somewhere

      Sounds even worse lol

      2 replies →

    • If the information being sent from my device cannot be derived from anything other than my own data then it is my data. I don't care what pretty dress you put on it.

    • > It's the equivalent of sending an MD5 of your password somewhere

      a) MD5 is reversible, it just costs GPU time to brute force

      b) It is unproven that their implementation is irreversible

      2 replies →

    • Well that's what you're told is happening. As it's all proprietary closed source software that you can't inspect or look at or verify in any manner, you have absolutely zero evidence whether that's what's actually happening or not.

      1 reply →

  • "Your data" is not actually being sent off your device, actually, it is being scrambled into completely unusable form for anyone except you.

    This is a much greater level of security than what you would expect from a bank, for example, which needs to fully decrypt the data you send it. When using your banking apps over HTTPS (TLS), you are trusting the CA infrastructure, you are trusting all sorts of things. You have fewer points of failure when a key for homomorphic encryption resides only on your device.

    "Opting-in by default" is therefore not unsafe.

  • I guess it depends on what you're calling "your data" -- without being able to reconstruct an image from a noised vector, can we say that that vector in any way represents "your data"? The way the process works, Apple makes their own data that leaves your device, but the photo never does.

    • It's the same as the CSAM initiative. It doesn't matter what they say they send, you cannot trust them to send what they say they send or trust them not to change it in the future.

      Anything that leaves my devices should do so with my opt-IN permission.

      1 reply →

  • How would you explain client side vectorization, differential privacy and homomorphic encryption to a layman in a single privacy popup so that they can make an informed choice?

    Or is it better to just trust that mathematics works and thus encryption is a viable way to preserve privacy and skip the dialog?

  • Do you consider your data to include non-reversible hashes of your data injected with random noise? I'm not sure I consider that my data. It's also not even really metadata about my data.

  • Do you use iCloud to store your photos?

    • I’m not the person you asked, but I agree with them. To answer your question: No, I do not use iCloud to store my photos. Even if I did, consent to store data is not the same as consent to scan or run checks on it. For a company whose messaging is all about user consent and privacy, that matters.

      This would be easily solvable: On first run show a window with:

      > Hey, we have this new cool feature that does X and is totally private because of Y [link to Learn More]

      > Do you want to turn it on? You can change your mind later in Settings

      > [Yes] [No]

      37 replies →

  • When your phone sends out a ping to search for cellular towers, real estate brokers collect all that information to track everywhere you go and which stores you visit.

    Owning a phone is a privacy failure by default in the United States.

    • > When your phone sends out a ping to search for cellular towers, real estate brokers collect all that

      Care to provide a pointer to what device they are using? I would absolutely get my real estate license for this.

    • You are being downvoted because you're so painfully correct. It's not an issue exclusive to the United States, but American intelligence leads the field far-and-away on both legal and extralegal surveillance. The compliance forced by US Government agencies certainly helps make data tracking inescapable for the average American.

      Unfortunately, the knee-jerk reaction of many defense industry pundits (and VCs, for that matter) is that US intelligence is an unparalleled moral good, and the virtues of privacy aren't worth hamstringing our government's work. Many of these people will try to suppress comments like yours because it embarrasses Americans and American business by association. And I sympathize completely - I'm dumbfounded by the response from my government now that we know China is hacking our telecom records.

      1 reply →

> This is what a good privacy story looks like.

What a good privacy story looks like is that my photos aren’t sent anywhere in any way shape or form without explicit opt in permission.

You're presenting a false dichotomy between "perfect user understanding" and "no user choice." The issue isn't whether users can fully comprehend homomorphic encryption or differential privacy – it's about basic consent and transparency.

Consider these points:

1. Users don't need a PhD to understand "This feature will send data about your photos to Apple's servers to enable better search."

2. The complexity of the privacy protections doesn't justify removing user choice. By that logic, we should never ask users about any technical feature.

3. Many privacy-conscious users follow a simple principle: they want control over what leaves their device, regardless of how it's protected.

The "it's too complex to explain" argument could justify any privacy-invasive default. Would you apply the same logic to, say, enabling location services by default because explaining GPS technology is too complex?

The real solution is simple: explain the feature in plain language, highlight the benefits, outline the privacy protections, and let users make their own choice. Apple already does this for many other features. "Default off with opt-in" is a core principle of privacy-respecting design, regardless of how robust the underlying protections are.

  • I don't believe I said or implied that anywhere: 'You're presenting a false dichotomy between "perfect user understanding" and "no user choice."'? Happy to be corrected if wrong.

    Closest I come to presenting an opinion on the right UX was "I'm not sure what the right call is here." The thing I disagreed with was a technical statement: "the only way to guarantee computing privacy is to not send data off the device."

    Privacy respecting design and tech is a passion of mine. I'm pointing out that "user choice" gets hard as the techniques used for privacy exceed the understanding of users. Users can intuitively understand "send my location to Google [once/always]" without understanding GPS satellites. Users can't understand the difference between "send my photo" and "send a homomorphically encrypted, locally differentially private vector at ε=0.8" and "send a differentially private vector at ε=50". Your prompt "send data about your photos..." would allow for much less private designs than this. If we want to move beyond "ask the user then do it", we need to get into the nitty gritty details here. I'd love to see more tech like this in consumer products, where it's private when used, even when opted in.
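
    To make that ε contrast concrete (Laplace mechanism on a sensitivity-1 value, rough numbers only, not Apple's exact scheme):

        # The Laplace noise scale is sensitivity / epsilon, so:
        for eps in (0.8, 50):
            print(eps, "-> typical noise magnitude ~", round(1 / eps, 3))
        # 0.8 -> ~1.25  (noise on the order of, or larger than, a 0/1 signal)
        # 50  -> ~0.02  (barely any noise: "we use DP" in name only)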

    • I appreciate your passion for privacy-respecting technology and your clarification. You make good points about the nuances of privacy-preserving techniques. However, I think we can separate two distinct issues:

      1. The technical excellence of Apple's privacy protections (which you've explained well and seem robust)

      2. The ethical question of enabling data transmission by default

      Even with best-in-class privacy protections, the principle of user agency matters. A simplified prompt like "This feature will analyze your photos locally and send secure, anonymized data to Apple's servers to enable better search" would give users the basic choice while being technically accurate. The technical sophistication of the privacy measures, while commendable, doesn't override the need for informed consent.

    • This is not a matter of respect, it is a matter of ethics. Otherwise you will just end up rationalizing technocratic, unethical technology. No amount of passion will justify that.

    • The choice is between "use an online service" or "don't use an online service". That's simple enough for anyone to understand.

      Apple can try to explain as best it can how user data is protected when they use the online service, and then the user makes a choice to either use the service or not.

      In my case, I don't even have a practical use for the new feature, so it's irrelevant how private the online service is. As it is, though, Apple silently forced me to use an online service that I never wanted.

> This is what a good privacy story looks like.

A good privacy story actually looks like not sending any info to anyone else anywhere at any time.

  • Your answer shows how we all have a very different idea of what our own desired privacy level is. Or what privacy even means.

    • If you think that sending data to a remote server is equally private to not sending it, then you are the one who doesn't know what privacy means.

      Of course it's fine to not desire privacy, or to desire a privacy level that is less than private. That's up to you. I liked the privacy of my old Canon digicam that had no internet. A photo app on a phone that sends stuff over the network might bring some useful functionality in return, but it can only be considered a regression in terms of privacy.

      1 reply →

  • Sure, but if we follow that line of thinking to its logical conclusion, we must move to a cabin in the woods, 100 miles from the nearest civilization, growing our own food and never connecting our computing devices to anything resembling a network.

    • No? You can have a photos app that doesn't phone home while not having to move to a cabin in the woods. See: every photos app that doesn't phone home, and I currently don't live in a cabin in the woods.

    • I've read the post you're responding to like 3 times, and after pondering it deeply, I'm pretty sure the conclusion of their line of thinking pretty definitively stops at "Apple should not be sending data off the device without the user requesting it." If you think otherwise, you should maybe provide more of an argument.

      9 replies →

    • It's not charitable to construct a contrived situation no one is talking about and place that into the mouth of a commenter.

    • Slippery slope fallacy. Nothing you said derives from not wanting to send information to remote servers, it's a false dichotomy.

> The author themselves looks to be an Apple security researcher

They’re not. Jeff Johnson develops apps (specifically Safari extensions) for Apple platforms and frequently blogs about their annoyances with Apple, but they’re not a security researcher.

Thank you for this comment. I found the author's ignorance to be fairly discrediting, and was surprised to find so many follow up comments equally railing on Apple.

Between the quote you pointed out and:

"One thing I do know, however, is that Apple computers are constantly full of privacy and security vulnerabilities, as proved by Apple's own security release notes" which just reeks of survivorship bias.

I think the final call of what is right here _shouldn't_ be informed by the linked article.

IMO, enabled by default without opt-in is absolutely the right call when judging between 1: Feature value 2: Security risk 3: Consent Fatigue.

If you're data-conscious enough to disagree with my prior statement, you should consider having lockdown mode enabled.

If you disagree with my prior statement because of how Apple locks you into Photos, :shake_hands:.

If Enhanced Visual Search is still enabled by default in lockdown mode, then I think that's worth a conversation.

  • > I found the author's ignorance to be fairly discrediting

    Why in the world am I supposed to be an expert on homomorphic encryption? How many people in the world are experts on homomorphic encryption?

    > which just reeks of survivorship bias.

    What does that even mean in this context?

    > 1: Feature value

    What is the value of the feature? As the article notes, this new feature is flying so low under the radar that Apple hasn't bothered to advertise it, and the Apple media haven't bothered to mention it either. You have to wonder how many people even wanted it.

    > If you're data-conscious enough to disagree with my prior statement, you should consider having lockdown mode enabled.

    That's ridiculous. Apple itself has said, "Lockdown Mode is an optional, extreme protection that’s designed for the very few individuals who, because of who they are or what they do, might be personally targeted by some of the most sophisticated digital threats. Most people are never targeted by attacks of this nature." https://support.apple.com/105120

    Lockdown mode is basically for famous people and nobody else.

    • > Why in the world am I supposed to be an expert on homomorphic encryption? How many people in the world are experts on homomorphic encryption?

      No one, at any point, implied you had to be an expert on homomorphic encryption. But if you're going to evaluate the security risk of a feature, and then end up on the front page of HN for said security risk, I think it's fair to criticize your lack of objectivity (or attempt at objectivity) by way of not even trying to understand the technical details of the blog.

      I will say I think my word choice was unnecessarily harsh, I'm sorry. I think I meant more indifference/inattention.

      > What does that even mean in this context?

      Apple's list of Security releases is long and storied. By comparison, the Solana Saga Web3 phone's list of security releases is short and succinct. Therefore, the Solana Saga must be more secure and has better security than an Apple device!

      > What is the value of the feature? As the article notes, this new feature is flying so low under the radar that Apple hasn't bothered to advertise it, and the Apple media haven't bothered to mention it either. You have to wonder how many people even wanted it.

      The marketability of a feature is not necessarily correlated with its value. Some features are simply expected and would be silly to advertise, e.g. the ability to check email or text friends. Other features are difficult to evaluate for efficacy, so you release and collect feedback instead of advertising and setting false expectations.

      > Lockdown mode is basically for famous people and nobody else.

      Similar to feature value, the audience of that statement is your average person (read: does not read/post on Hacker News). Based on your pedigree, I feel as though you probably know better, and given your "no tolerance for risk" for such a feature, it's something worth at least considering, and definitely isn't ridiculous.

      I think it's great you started this conversation. I disagree with your opinion, and that's okay!! But I don't think it's particularly beneficial to any discourse to:

      1. Imply that you are evaluating security risk

      2. Be given a well written technical article so that you are able to make an informed decision (and then share that informed decision)

      3. Ignore relevant information from said article, make an uninformed decision

      4. Be surprised when someone says you made an uninformed decision

      5. Imply the only way to make an informed decision would be to be an expert in the relevant fields from the technical article

      Anyway - thanks for writing and replying. Creating and putting yourself out there is hard (as evidenced by my empty blog that I promised I'd add to for the past 2 years). And my criticism was too harsh.

      3 replies →

  • Enhanced Visual Search was enabled despite my having Lockdown Mode on. I worry about enhanced visual search capabilities much less than several of the other risky features that Lockdown Mode disables, but I was a bit surprised by the default opt-in on my Lockdown Mode phone.

> This is what a good privacy story looks like.

A good privacy story starts with "Do you consent" and not transmitting a byte if you answer "no"

This sounds exactly like that CSAM "feature" they wanted to add but created a huge outrage because of how incredibly invasive it was.

It sounds like it only needs a few extra lines of code to get exactly what they wanted before; they just packaged it differently and we all fell for it like frogs getting boiled in water.

I’m deeply familiar with all of these techniques, the core issue here is informed consent which they have not obtained.

Furthermore, Apple's privacy stance is generally a sham, as their definition of "human rights" doesn't extend to China. Which either means Apple doesn't respect human rights, or they don't view Chinese people as human.

  • Apple follows the law. First you need to get the Chinese government to respect those rights. The only other choice is to stop doing business entirely in the country.

    • A choice many companies have made. Apple is in China to make money, which is what a corporation is set up to do. My point is them claiming the moral high ground of a human rights defender is utterly laughable bullshit.

  • That's not really fair; Apple's in a sticky wicket when it comes to the Chinese government, and they're not the only ones.

    The Chinese government are debatably inhuman. They've literally censored the word "censorship." (Then they censored what people used euphemistically for censorship--"harmonious.") It's funny from the outside but also a miserable state of affairs in 2024.

    • It’s very fair, Apple has historically been very happy to be the sponsor of horrible human rights violations in their supply chain, only marginally paying attention to suicides in their factories when the PR got too bad.

      Apple's version of "human rights" includes suicide nets as an alternative to treating people humanely. That's why their stance is pure marketing - they have blood on their hands.

      And guess what? You can’t use Google in China, and while Google isn’t by any means perfect, they aren’t Apple.

      2 replies →

The nearest neighbour search is sharded, which Apple's blog admits is a privacy issue, which is why they're running the DP and OHTTP parts.

If Apple were to add additional clusters that match "sensitive" content and endeavour to put them in their own shards distinct from landmarks, they defeat the point of the homomorphic encryption part while still technically doing it.

The DP part can be defeated with just statistics over time; someone with any volume of sensitive content will hit these sensitive clusters with higher likelihood than someone generating noise-injected fake searches.
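
Roughly what that erosion looks like, as a toy sketch (made-up assumptions: a sensitivity-1 signal, per-query Laplace noise at ε = 0.8, and no budget tracking across queries):

    import random, statistics

    epsilon = 0.8
    scale = 1.0 / epsilon
    true_signal = 1.0  # e.g. "this library keeps hitting one particular cluster"

    def noisy_report(x):
        # per-query Laplace noise, sampled as a difference of exponentials
        return x + random.expovariate(1 / scale) - random.expovariate(1 / scale)

    single = noisy_report(true_signal)                      # says very little
    many = [noisy_report(true_signal) for _ in range(500)]  # accumulated over time
    print(round(single, 2), round(statistics.mean(many), 2))  # the mean is ~1.0

A single noisy report is well hidden; the average of hundreds is not, which is why the privacy budget (and whether it's actually enforced per user) matters so much.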

The OHTTP part can be defeated in several ways, the simplest of which is just having a clause in a non-public contract allowing apple to request logs for some purpose. They're paying them and they can make up the rules as they go.

This must be the first consumer or commercial product implementing homomorphic encryption, is it not?

I would be surprised if doing noisy vector comparisons is actually the most effective way to tell if someone is in front of the Eiffel Tower. A small large language model could caption it just as well on device. My spider sense tells me someone saw an opportunity to apply bleeding-edge, very cool tech so that they can gain experience and do it bigger and better in the future, but they're fumbling their reputation by doing this kind of user data scanning.

  • > This must be the first consumer or commercial product implementing homomorphic encryption is it not?

    Not really, it's been around for a bit now. From 2021:

    > The other major reason we’re talking about HE and FL now is who is using them. According to a recent repository of PETs, there are 19 publicly announced pilots, products, and proofs of concept for homomorphic encryption and federated analytics (another term for federated learning) combined. That doesn’t seem like a lot … but the companies offering them include Apple,7 Google, Microsoft, Nvidia, IBM, and the National Health Service in the United Kingdom, and users and investors include DARPA, Intel, Oracle, Mastercard, and Scotiabank. Also, the industries involved in these early projects are among the largest. Use cases are led by health and social care and finance, with their use in digital and crime and justice also nontrivial (figure 1).

    https://www2.deloitte.com/us/en/insights/industry/technology...

    I do wonder why we don't hear about it more often though. "Homomorphic encryption" as a buzzword has a lot of headline potential, so I'm surprised companies don't brag about it more.

    • But what are the products from them that implement HE and that consumers are using? Microsoft, IBM, Intel, and Google have all released libraries for HE, and there's Duality SecurePlu, but as far as actual consumer products go, Apple's caller ID phone number lookup and other features in iOS 18 are very possibly the first.

      As far as why it's not more of a buzzword, it's far too in the weeds and ultimately consumers either trust you or they don't. And even if they don't trust you, many of them are still going to use Apple/Google/Facebook system anyway.

> This is what a good privacy story looks like.

I have an idea: send an encrypted, relayed, non-reversible, noised vector representation of your daily phone habits and interactions. That way you can be bucketed, completely anonymously of course, with other user cohorts for tracking, advertising, and other yet-to-be discovered purposes.

It's a great privacy story! Why would you have a problem with that?

  • What would be the value to the user in your scenario? In the photos app real scenario, it’s to enable a search feature that requires pairing photos with data not on the phone. (I understand you’re being sarcastic.)

    • Maybe we can do some analysis and optimize phone battery life based on your cohorts usage patterns.

      I don't know, I'm sure we'll figure something out once we have your data!

      3 replies →

  • They don't "have your data," even at an aggregated and noised level, due to the homomorphic encryption part.

    Restating the layers above, in reverse:

    - They don't see either your data or the results of the query (it's fully encrypted even from them where they compute the query -- this is what homomorphic encryption means)

    - Even if they broke the encryption and had your query data / the query result, they don't know who "you" are (the relay part)

    - Even if they had your query hash and your identity, they couldn't reverse the hash to identify which specific photos you have in your library (the client-side vectorization + differential privacy part), though by that point they could know which records in the places database were hits. So they could know that you took a photo of a landmark, but only if the encryption and relay were both broken.

I am a bit confused: Data is being sent to Apple, in such a way that it can not be traced back to the user. Apple does some processing on it. Then somehow magically, the pictures on your phone are updated with tags based on Apple's processing....but Apple doesn't know who you are.....

  • You joked, but you accidentally described what homomorphic encryption does. (if implemented correctly)

    > Then somehow magically, the pictures on your phone are updated with tags based on Apple's processing....but Apple doesn't know who you are.....

    Yes, this is the whole point.

  • There is a way to perform processing on encrypted data so that the result is also encrypted, and the person doing the processing never learns anything about the data being processed or the result (which can only be decrypted by the user with the original encryption keys)

    https://en.wikipedia.org/wiki/Homomorphic_encryption

    > Homomorphic encryption is a form of encryption that allows computations to be performed on encrypted data without first having to decrypt it. The resulting computations are left in an encrypted form which, when decrypted, result in an output that is identical to that produced had the operations been performed on the unencrypted data. Homomorphic encryption can be used for privacy-preserving outsourced storage and computation. This allows data to be encrypted and outsourced to commercial cloud environments for processing, all while encrypted

    And the way the data comes back to you is via the third-party relay which knows your IP but nothing else
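
    If the "compute on data you can't read" part sounds impossible, here's a toy demonstration of the idea using the Paillier cryptosystem (additively homomorphic, tiny primes, illustration only -- not the lattice-based scheme Apple describes, and real deployments use large moduli and hardened libraries):

        import math, random

        p, q = 17, 19
        n, n2 = p * q, (p * q) ** 2           # public modulus and its square
        lam = math.lcm(p - 1, q - 1)          # private key
        g = n + 1                             # standard generator choice
        mu = pow(lam, -1, n)                  # precomputed decryption constant

        def encrypt(m):
            r = random.randrange(1, n)
            while math.gcd(r, n) != 1:
                r = random.randrange(1, n)
            return (pow(g, m, n2) * pow(r, n, n2)) % n2

        def decrypt(c):
            return ((pow(c, lam, n2) - 1) // n * mu) % n

        a, b = 12, 30
        c_sum = (encrypt(a) * encrypt(b)) % n2   # server "adds" without ever seeing a or b
        print(decrypt(c_sum))                    # only the key holder recovers 42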

    • Ok, that's the step that was missing. I couldn't figure out how there was a benefit to the users without data being fed back and data can't be fed back without knowing some ID.

      So, while Apple doesn't know the ID of the person sending the data, they have a 'room number' that links back to an ID.

      If Apple were to decide to scan photos for pictures of 'lines of white powder' they couldn't tell the police your name but they could say that the 3rd party knows who you are.

For context, @scosman is self-described as “Formerly Apple Photos” in his Twitter bio.

The devil is in the differential privacy budget. I am in Japan and I’ve taken hundreds of photos this week. What does that budget cover?

> - OHTTP relay: it's sent through a 3rd party so Apple never knows your IP address. The contents are encrypted so the 3rd party doesn't learn anything either (some risk of exposing "IP X is an Apple Photos user", but nothing about the content of the library).

Which 3rd party is that?

  • I don't have a list on hand, but at least Cloudflare and Akamai are part of the network hops. Technically you only need 2 hops to make sure no origin or data extraction can be done.

> There are other tools to provide privacy (DP, homomorphic encryption), while also using services. They are immensely complicated, and users can't realistically evaluate risk.

It is simple for any user to evaluate the risk of their data being breached on 3rd party servers when their data isn't being sent off the device - there is none. It is only when corporations insist that they are going to send the data off your device whether you like it or not that evaluating risk becomes necessary.

Regarding HE: since the lookup is generated by the requestor, it can be used as an adversarial vector, which can result in exfiltration by nearest neighbor (closest point to vector) methods. In other words, you can change what you are searching for, and much like differential power analysis attacks on crypto, extract information.

This may be a "good" privacy story but a way better one is to just not send any of your data anywhere, especially without prior consent.

The best you can hope for is integrity and security until your information reaches the destination, but to assume that Apple or the U.S. government cannot decipher the information you sent, or use it against you (e.g. set a person of interest as a "landmark" and find out whose iPhone matches that "landmark"), you must be foolish.

It's no longer a conspiracy. I think we are all well past that time (i.e. with Snowden and WikiLeaks). We live in a surveillance world and "They're guarding all the doors and holding all the keys".

> This is what a good privacy story looks like.

Not at all. A good privacy story is not sending this data anywhere.

> I'm not sure what the right call is here.

I am sure.

The right call is to never send any data from the device to anyone unless the user explicitly tells the device to do it.

The only thing the device should do is whatever its user tells it to do.

The user didn't tell it to do this. Apple did.

> But the conclusion "Thus, the only way to guarantee computing privacy is to not send data off the device." isn't true

Irrelevant. It was never about privacy to begin with. It was always about power, who owns the keys to the machine, who commands it.

Vectorization, differential privacy, relays, homomorphic encryption, none of it matters. What matters is the device is going behind the user's back, doing somebody else's bidding, protecting somebody else's interests. That they were careful about it offers little comfort to users who are now aware of the fact "their" devices are doing things they weren't supposed to be doing.

  • Complete nonsense. *All networked devices do things behind their users' backs* at this point, and have for years, and do not ask for consent for most of it. And users would REJECT granular opt-in as a terrible UX.

    Let's look at the primary alternative, Android. It generally does not provide you this level of granular control on network access either without rooted hacks. Apps and the phone vendor can do whatever they want with far less user control unless you're a deep Android nerd and know how to install root-level restriction software.

    • Yes, apps going behind people's back and exfiltrating personal information has become normal. That's not an argument, it's merely a statement of fact. This shouldn't be happening at all. The fact it got to this point doesn't imply it shouldn't be stopped.

      No one's advocating for granular opt in either. There are much better ways. We have to make it so that data is toxic to corporations. Turn data into expensive legal liabilities they don't want to deal with. These corporations should not even be thinking about it. They should be scrambling to forget all about us the second we are done with them, not covertly collecting all the data they possibly can for "legitimate" purposes. People should be able to use their computers without ever worrying that corporations are exploiting them in any way whatsoever.

      The Android situation is just as bad, by the way. Rooting is completely irrelevant. You may think you can hack it but if you actually do it the phone fails remote attestation and the corporations discriminate against you based on that, usually by straight up denying you service. On a very small number of devices, Google's phones ironically, you can access those keys and even set your own. And it doesn't matter, because the corporations don't trust your keys, they only trust the keys of other corporations. They don't care to know your device is secure, they want to know it's fully owned by Google so that you can't do things the corporations don't like.

      It's not something that can be solved with technology. Computer freedom needs to become law.

> if you want features that require larger-than-disk datasets, or frequently changing content, you need tools like this.

Well I want them to fuck off.

Hidden in your commentary here is the fact that the vector representation of the image is _the contents of the image_. It very well may be that they cannot reverse the exact image. But it’s still a representation of the image that has to be good for something. Without being too familiar I would be willing to hazard a guess that this could include textual labels and classifications of what is in the image.

I don’t give a shit how good your internal controls are. I don’t have anything particularly interesting to hide. I still do not want you taking my pictures.

  • If you read the research you'd know that they don't have access to the vector either. They never decrypt the data. All operations on their server are done directly on the encrypted data. They get 0 information about your photos. They cannot even see which landmark your vector was closest to.

    • I don’t care! I do not want them helping themselves to representations of my data. I don’t care if it’s encrypted or run through a one way hash. I don’t care if they only interact with it via homomorphic methods. They can, again, fuck the hell off.

      A private corporation has no business reading my files for its own benefit.

      1 reply →

> "It ought to be up to the individual user to decide their own tolerance for the risk of privacy violations." -> The author themselves looks to be an Apple security researcher, and are saying they can't make an informed choice here

I don’t think that that’s what the author is saying at all, I think he’s saying that Apple should let the user decide for themself if they want to send all this shit to Apple, freedom for the individual. They’re not saying “I dunno”

this is what gaslighting looks like

how about they don't send anything about my photos to their servers and i get to keep my shit on my own device

i suppose we're past that to the point where techbros like you will defend personal data exfiltration because.. uhh idk? trillion dollar corporation knows best?

So what? Why should the application talk over the Internet to begin with? And why isn't that functionality off by default under a settings option that clearly warns the user of the consequences? I think you're missing the forest for the trees here.

And the claims that this is good privacy/security are not at all obvious either. And who are those third-parties anyway? Did you verify each one of them?

Quantum makes the homomorphic stuff ineffective in the mid-term. All they have to do is hold on to the data and they can get the results of the lookup table computation, in maybe 10-25 years. Shouldn't be on by default.

  • What makes you think that this is the biggest problem if things like AES and RSA are suddenly breakable?

    If someone wanted to get a hold of your cloud hosted data at that point, they would use their capacity to simply extract enough key material to impersonate a Secure Enclave. At that point, you "are" the device and as such you "are" the user. No need to make it more complicated than that.

    In theory, Apple and other manufacturers would already use PQC to prevent such scenarios. Then again, QC has been "coming soon" for so long, it's doubtful that any information that is currently protected by encryption will still be valuable by the time it can be cracked. Most real-world process implementations don't rely on some "infinite insurance", but assume it will be breached at some point and just try to make it difficult or costly enough to run out the clock on confidentiality, which is all that really matters. Nothing that exists really needs to be confidential forever. Things either get lost/destroyed or become irrelevant.

    • This is ostensibly for non-cloud data, with derivatives of it auto-uploaded after an update.

The right call is to provide the feature and let users opt-in. Apple knows this is bad, they've directly witnessed the backlash to OCSP, lawful intercept and client-side-scanning. There is no world in which they did not realize the problem and decided to enable it by default anyways knowing full-well that users aren't comfortable with this.

People won't trust homomorphic encryption, entropy seeding or relaying when none of it is transparent and all of it is enabled in an OTA update.

> This is what a good privacy story looks like.

This is what a coverup looks like. Good privacy stories never force third-party services on a user, period. When you see that many puppets on stage in one security theater, it's only natural for things to feel off.

  • > This is what a coverup looks like.

    That’s starting to veer into unreasonable levels of conspiracy theory. There’s nothing to “cover up”, the feature has an off switch right in the Settings and a public document explaining how it works. It should not be on by default but that’s not a reason to immediately assume bad faith. Even the author of the article is concerned more about bugs than intentions.

    • > the feature has an off switch right in the Settings and a public document explaining how it works

      Irrelevant.

      This is Apple's proprietary software, running on Apple's computers, devices which use cryptography to prevent you from inspecting it or running software they don't control. Very few people have any idea how it actually works or what it actually does.

      That there's some "public document" describing it is not evidence of anything.

      > that’s not a reason to immediately assume bad faith

      The mere existence of this setting is evidence of bad faith. The client side scanning nonsense proved controversial despite their use of children as political weapons. That they went ahead and did this despite the controversy removes any possible innocence. It tells you straight up that they cannot be trusted.

      > Even the author of the article is concerned more about bugs than intentions.

      We'll draw our own conclusions.

    • Would you feel the same if Microsoft turned on Recall on all Windows PCs everywhere with an update?

      They worked very hard on security these past few months, so it should be all good, right?

      3 replies →

  • > This is what a coverup looks like

    This is a dumb take. They literally own the entire stack Photos runs on.

    If they really wanted to do a coverup we would never know about it.

    • Why wouldn't this mistake be addressed in a security hotfix then? Apple has to pick a lane - this is either intended behavior being enabled against user's wishes, or unintended behavior that compromises the security and privacy of iPhone owners.

      3 replies →

  • It's not that binary. Nobody is forcing anything; you can choose not to buy a phone, you can choose not to use the internet. Heck, you can even not install any updates!

    What is happening is that people make tradeoffs, and decide to what degree they trust who and what they interact with. Plenty of people might just 'go with the flow', but putting what Apple did here in the same bucket as what, for example, Microsoft or Google does is a gross misrepresentation. Presenting it all as equal just kills the discussion, and doesn't inform anyone to a better degree.

    When you want to take part in an interconnected network, you cannot do that on your own, and you will have to trust other parties to some degree. This includes things that might 'feel' like you can judge them (like your browser used to access HN right here), but you actually can't unless you understand the entire codebase of your OS and Browser, all the firmware on the I/O paths, and the silicon it all runs on. So you make a choice, which as you are reading this, is apparently that you trust this entire chain enough to take part in it.

    It would be reasonable to make this optional (as in, opt-in), but the problem is that you end up asking a user for a ton of "do you want this" questions, almost every upgrade and install cycle, which is not what they want (we have had this since Mavericks and Vista, people were not happy). So if you can engineer a feature to be as privacy-centric yet automated as possible, it's a win for everyone.

    • > What is happening, is that people make tradeoffs, and decide to what degree they trust who and what they interact with.

      People aren't making tradeoffs - that's the problem. Apple is making the tradeoffs for them, and then retroactively asking their users "is this okay?"

      Users shouldn't need to buy a new phone to circumvent arbitrary restrictions on the hardware that is their legal property. If America had functional consumer protections, Apple would have been reprimanded harder than their smackdowns in the EU.

      3 replies →

    • I don't want my photos to take part in any network; never asked for it, never expected it to happen. I never used iCloud or other commercial cloud providers. This is just forceful data extraction by Apple, absolutely egregious behavior.

      10 replies →