Comment by Gabriel54
1 month ago
I appreciate the explanation. However, I think you do not address the main problem, which is that my data is being sent off my device by default and without any (reasonable) notice. Many users may agree to such a feature (as you say, it may be very secure), but to assume that everyone ought to be opted in by default is the issue.
I'm not sure I agree -- asking users about every single minor feature is (a) incredibly annoying, and (b) quickly causes request-blindness in even reasonably security-conscious users. So reserving the nagging for risky or particularly invasive things makes sense to me.
Maybe they should lump its default state into something that already exists? E.g. assume that if you already have location access enabled for Photos (it does ask!), you've already indicated that you're okay with something identifying being sent to Apple whenever you take a picture.
My understanding is that Location Services will, among other things, send a hash of local WiFi network SSIDs and signal strengths to a database Apple maintains, and use that to triangulate a possible position for you. This seems loosely analogous to what's going on here with the compute-a-vector thing.
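For anyone curious about the mechanics, here's a toy sketch of that kind of lookup (the names, hashing, and weighting are all made up for illustration; the real protocol isn't public). The client reports hashed access-point identifiers plus signal strengths, and the server averages the surveyed locations it knows for those hashes, weighting stronger signals more:

    import Foundation

    // Hypothetical shapes, purely to illustrate the idea.
    struct VisibleAP {
        let bssidHash: String  // e.g. SHA-256 of the BSSID, not the raw identifier
        let rssi: Double       // signal strength in dBm (negative; nearer 0 is stronger)
    }

    struct Coordinate { var lat: Double; var lon: Double }

    // Server-side lookup (assumed): hash -> previously surveyed location.
    func estimatePosition(_ aps: [VisibleAP],
                          database: [String: Coordinate]) -> Coordinate? {
        var lat = 0.0, lon = 0.0, total = 0.0
        for ap in aps {
            guard let loc = database[ap.bssidHash] else { continue }
            let weight = 1.0 / max(1.0, -ap.rssi)  // crude: stronger signal counts more
            lat += loc.lat * weight
            lon += loc.lon * weight
            total += weight
        }
        guard total > 0 else { return nil }
        return Coordinate(lat: lat / total, lon: lon / total)
    }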
> Maybe they should lump its default state into something that already exists?
It could be tied to iCloud Photos, perhaps, because then you already know that your photos are getting uploaded to Apple.
Insofar as the photos aren't getting uploaded to Apple for this, that seems a bit extreme.
(We could argue about it, but personally I think some kind of hash doesn't qualify.)
9 replies →
"asking users about every single minor feature is (a) incredibly annoying"
Then why lie and mislead customers that your data stays local?
I don't think that's a fair characterization of what they're doing.
4 replies →
> asking users about every single minor feature
Then perhaps the system is of poor design and needs further work before being unleashed on users…
Especially for a company which heavily markets how privacy-focused it is:
1) Sending my personal data to them in any way is not a "feature." It's especially not a feature because what it sets out to do is unnecessary: every photo already has geotagging, time-based grouping, and AI/ML/whatever on-device keyword assignments and OCR. I can open up my phone right now and search for every picture that has grass in it. I can search for "washington" and if I took a picture of a statue of George Washington that shows the plaque, my iPhone already OCR'd that and will show the photo.
2) "Minor" is not how I would ever describe sending data based off my photos to them, regardless of how much it's been stuffed through a mathematical meat grinder.
3) Apple is usually very upfront about this sort of thing, and also loves to mention the most minor, insignificant, who-gives-a-fuck feature addition in the change notes for "point" system updates. We're talking things like "Numbers now supports setting font size in chart legends" (I'm making that up, but you get the point.)
This was very clearly an "ask for forgiveness because the data we want is absolutely priceless and we'll get lots of it by the time people notice / word gets out." It's along the lines of Niantic using the massive trove of photos from the pokemon games to create 3d maps of everywhere.
I specifically use iOS because I value my privacy (and don't want my cell phone data plan, battery power, etc to be a data collection device for Google.) Sending data based off my photos is a hard, do-not-pass-go-fuck-off-and-die line in the sand for me.
It's especially shitty because they've gated a huge amount of their AI shit behind owning the current iPhone model....but apparently my several generation old iPhone is more than good enough to do some AI analysis on all my photos, to upload data for them?
Fuck everyone at Apple who was involved in this.
> This was very clearly an "ask for forgiveness because the data we want is absolutely priceless and we'll get lots of it by the time people notice / word gets out."
It's very clearly not, since they've gone to huge lengths to make sure they can't actually see the data themselves; see the grandparent post.
1 reply →
> It's especially shitty because they've gated a huge amount of their AI shit behind owning the current iPhone model....but apparently my several generation old iPhone is more than good enough to do some AI analysis on all my photos
Hear, hear. As if they can do this but not Visual Intelligence, which is just sending a photo to their servers for analysis. Apple has always had artificial limitations, but they've been getting more egregious of late.
I think it does address the main problem. What he is saying is that multiple layers of security are used to ensure (with mathematically provable guarantees) that there is no risk in sending the data, because it is encrypted and sent in such a way that Apple or any third party will never be able to read or access it (again, based on provable math). If there is no risk there is no harm, and then the calculus for "by default", opt-in/out, notifications, etc. is different.
The problem with this feature is that we cannot verify that Apple's implementation of the math is correct and without security flaws. Everyone knows there are security flaws in all software, and this implementation is not open (i.e., we cannot review the code, and even if we could, we cannot verify that the provided code was the code used in the iOS build). So we have to trust that Apple did not make any mistakes in their implementation.
Your second paragraph is exactly the point made in the article as the reason why it should be an informed choice and not something on by default.
If you don’t trust Apple to do what they say they do, you should throw your phone in the bin because it has total control here and could still be sending your data even if you opt out.
2 replies →
As someone with a background in mathematics I appreciate your point about cryptography. That said, there is no guarantee that any particular implementation of a secure theoretical algorithm is actually secure.
There is also no guarantee that Apple isn't lying about everything.
They could just have the OS batch uploads until a later point, e.g. when the phone checks for updates.
The point is that this is all about risk mitigation not elimination.
19 replies →
You're welcome to check their implementation yourself:
https://github.com/apple/swift-homomorphic-encryption
Hypothetical scenario: Theo de Raadt and Bruce Schneier are hired to bring Apple products up to their security standards. They are given a public blog, and they are not required to sign an NDA. They fix every last vulnerability in the architecture. Vladimir Putin can buy MacBooks for himself and his generals in Moscow, enable Advanced Data Protection, and collaborate on war plans in total confidence.
Where are the boundaries in this scenario?
Theo de Raadt is less competent than Apple's security team (and its external researchers). The main thing OpenBSD is known for among security people is adding random mitigations that don't do anything because they thought them up without talking to anyone in the industry.
1 reply →
Freedom of speech cannot exist without private communications. It is an inalienable right; therefore privacy is as well.
I am pretty sure that if we had those people in charge of stuff like this there would be no bar above which "opt in by default" would happen, so I am unsure of your point?
Except for the fact (?) that quantum computers will break this encryption, so if you wanted to you could hoard the data and just wait a few years and then decrypt?
Quantum computers don't break Differential Privacy. Read the toy example at https://security.googleblog.com/2014/10/learning-statistics-...
>Let’s say you wanted to count how many of your online friends were dogs, while respecting the maxim that, on the Internet, nobody should know you’re a dog. To do this, you could ask each friend to answer the question “Are you a dog?” in the following way. Each friend should flip a coin in secret, and answer the question truthfully if the coin came up heads; but, if the coin came up tails, that friend should always say “Yes” regardless. Then you could get a good estimate of the true count from the greater-than-half fraction of your friends that answered “Yes”. However, you still wouldn’t know which of your friends was a dog: each answer “Yes” would most likely be due to that friend’s coin flip coming up tails.
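To make the estimate concrete, here's a self-contained simulation of that coin-flip scheme (randomized response, the simplest form of differential privacy); the friend count and true fraction are arbitrary:

    import Foundation

    // Each friend flips a coin: heads -> answer truthfully,
    // tails -> always answer "Yes".
    func estimateDogFraction(friends: Int, trueFraction: Double) -> Double {
        var yes = 0
        for _ in 0..<friends {
            let isDog = Double.random(in: 0..<1) < trueFraction
            let heads = Bool.random()
            if (heads ? isDog : true) { yes += 1 }
        }
        // P(yes) = 0.5 * p + 0.5, so p = 2 * (P(yes) - 0.5).
        let yesFraction = Double(yes) / Double(friends)
        return 2 * (yesFraction - 0.5)
    }

    // Converges on the true fraction, yet any individual "Yes"
    // stays deniable: the coin may simply have come up tails.
    print(estimateDogFraction(friends: 100_000, trueFraction: 0.3)) // ~0.3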
> Except for the fact (?) that quantum computers will break this encryption […]
Quantum computers will make breaking RSA and Diffie-Hellman public-key encryption easier. They will not affect things like AES, nor things like hashing:
> Client side vectorization: the photo is processed locally, preparing a non-reversible vector representation before sending (think semantic hash).
And for RSA and DH, there are algorithms being deployed to deal with that:
* https://en.wikipedia.org/wiki/NIST_Post-Quantum_Cryptography...
Quantum computers don't meaningfully exist yet and won't for a while, and once they do, they still won't be able to crack this. Quantum computers aren't this magical "the end is nigh" gotcha to everything, and unless you're that deep into the subject, the bigger question you've got to ask yourself is why a magic future technology is so important to you that you just had to post your comment.
Anyway, back to the subject at hand; here's Apple on that subject:
> We use BFV parameters that achieve post-quantum 128-bit security, meaning they provide strong security against both classical and potential future quantum attacks
https://machinelearning.apple.com/research/homomorphic-encry...
https://security.apple.com/blog/imessage-pq3/
I’m a cryptographer and I just learned about this feature today while I’m on a holiday vacation with my family. I would have loved the chance to read about the architecture, think hard about how much leakage there is in this scheme, but I only learned about it in time to see that it had already been activated on my device. Coincidentally on a vacation where I’ve just taken about 400 photos of recognizable locations.
This is not how you launch a privacy-preserving product if your intentions are good, this is how you slip something under the radar while everyone is distracted.
In engineering we distinguish the "how" of verification from the "why" of validation; it looks like much of the comment disagreement in this thread is about the premise of whether ANY outgoing data counts as a privacy-consent issue. It's not a technical issue, it's a premise disagreement, and that can be hard to explain to the other side.
The premise of my disagreement is that privacy-preserving schemes should get some outside validation by experts before being turned on as a default. Those experts don’t have to be me, there are plenty of people I trust to check Apple’s work. But as far as I can tell, most of the expert community is learning about this the same way that everyone else is. I just think that’s a bad way to approach a deployment like this.
1 reply →
To play Apple's advocate, this system will probably never be perfect, or stand up to full scrutiny from everyone on the planet. And they also need as many people as possible activated, as it's an adversarial feature.
The choice probably looks to them like:
A) Announce it and ask everyone to opt in: a public firestorm, low adoption, and the feature is dead on arrival.
B) Ship it on by default and explain the cryptography afterwards.
Whether B works is up for debate, but that was probably their only chance to ship it from their POV.
To give you feedback in your role as Apple's advocate:
"we had to sneak it out because people wouldn't consent if we told them" isn't the best of arguments
1 reply →
Did a variation of A already happen in 2022, with "client-side scanning of photos"?
1 reply →
I think I'm saying: you're not sending "your data" off device. You are sending a homomorphically encrypted locally differentially private vector (through an anonymous proxy). No consumer can really understand what that means, what the risks are, and how it would compare to the risk of sending someone like Facebook/Google raw data.
I'm asking: what does an opt-in for that really look like? You're not going to be able to give the user enough info to make an educated decision. There's a ton of risk of "privacy washing" ("we use DP" but at a very poor epsilon, or "we use E2E encryption" with side-channel data gathering).
There's no easy answer. "Ask the user", when the question requires a PhD-level understanding of stats to evaluate the risk, isn't a great answer. But I don't have another one.
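For readers trying to picture the pipeline, here's a rough structural sketch with the embedding model and the cryptography stubbed out. To be clear: this is illustrative shape only, not Apple's implementation; a real deployment would use a trained embedding model, calibrated noise, and a scheme like BFV (e.g. via https://github.com/apple/swift-homomorphic-encryption):

    import Foundation

    // 1. Client-side vectorization: photo -> fixed-length feature vector.
    //    (Stand-in arithmetic; really an on-device neural embedding.)
    func embed(_ pixels: [Double]) -> [Double] {
        (0..<8).map { i in
            pixels.enumerated().reduce(0.0) { $0 + $1.element * sin(Double($1.offset + i + 1)) }
        }
    }

    // 2. Local differential privacy: perturb the vector *before* it
    //    leaves the device (Laplace noise via inverse-CDF sampling).
    func addLaplaceNoise(_ v: [Double], scale: Double) -> [Double] {
        v.map { x in
            let u = Double.random(in: -0.5..<0.5)
            let sign: Double = u < 0 ? -1 : 1
            return x - scale * sign * log(1 - 2 * abs(u))
        }
    }

    // 3. Homomorphic encryption (stub): the server computes similarity
    //    against its database without ever decrypting the query.
    struct Ciphertext { let blob: [UInt8] }
    func encryptForHE(_ v: [Double]) -> Ciphertext {
        Ciphertext(blob: [])  // placeholder; the key never leaves the device
    }

    // 4. Anonymizing relay (stub): strips the source IP, so the server
    //    can't tie the query to a user even if it wanted to.
    func sendViaRelay(_ c: Ciphertext) { /* e.g. an Oblivious HTTP hop */ }

    let query = addLaplaceNoise(embed([0.1, 0.7, 0.3]), scale: 0.5)
    sendViaRelay(encryptForHE(query))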
In response to your second question, opt-in would look exactly like this: don't have the box checked by default, with an option to enable it: "use this to improve local search, we will create an encrypted index of your data to send securely to our servers, etc..." A PhD is not necessary to understand the distinction between storing data locally on a machine vs. on the internet.
Even here with the HN crowd: it's not an index, it's not stored on a server, and it's not typical send-securely encryption (not PK or symmetric "encrypted in transit", but homomorphic "encrypted processing"). Users will think that's all gibberish (ask a user if they want to send an index or a vector representation? no clue).
Sure, you can ask users "do you want to use this". But why do we ask that? Historically it's user consent (knowingly opting in), and legal requirements around privacy. We don't have that pop up on any random new feature, it's gated to ones with some risk. There are questions to ask: does this technical method have any privacy risk? Can the user make informed consent? Again: I'm not pitching we ditch opt-in (I really don't have a fix in mind), but I feel like we're defaulting too quickly to "old tools for new problems". The old way is services=collection=consent. These are new privacy technologies which use a service, but the privacy is applied locally before leaving your device, and you don't need to trust the service (if you trust the DP/HE research).
End of the day: I'd really like to see more systems like this. I think there were technically flawed statements in the original blog article under discussion. I think new design methods might be needed when new technologies come into play. I don't have any magic answers.
5 replies →
I think the best response is to make it work how iCloud storage works: the option is to keep my stuff on the local device, or use iCloud.
Exactly. It's the height of arrogance to insist that normal users just can't understand such complex words and math, and therefore the company should not have to obtain consent from the user. As a normal lay user, I don't want anything to leave my device or computer without my consent. Period. That includes personal information, user data, metadata, private vectors, homomorphic this or locally differential that. I don't care how private Poindexter assures me it is. Ask. For. Consent.
Don't do things without my consent!!! How hard is it for Silicon Valley to understand this very simple concept?
20 replies →
There is significant middle ground between "do it without asking" and "ask about every single thing". A reasonable option would be "ask if the device can send anonymized data to Apple to enable such and such features". This setting can apply to this specific case, as well as other similar cases for other apps.
Asking the user is perfectly reasonable. Apple themselves used to understand and champion that approach.
https://www.youtube.com/watch?v=39iKLwlUqBo
If you can't meaningfully explain what you're doing then you can't obtain informed consent. If you can't obtain informed consent then that's not a sign to go ahead anyway, it's a sign that you shouldn't do it.
This isn't rocket surgery.
+100 for "rocket surgery".
I mostly agree. I'm just annoyed "this new privacy tech is too hard to explain" leads to "you shouldn't do it". This new privacy tech is a huge net positive for users.
Also: from other comments sounds like it might have been opt-in the whole time. Someone said a fresh install has it off.
1 reply →
I don't care if all they collect is the bottom right pixel of the image and blur it up before sending it, the sending part is the problem. I don't want anything sent from MY device without my consent, whether it's plaintext or quantum proof.
You're presenting it as if you have to explain elliptic curve cryptography in order to toggle a "show password" dialogue but that's disingenuous framing, all you have to say is "Allow Apple to process your images", simple as that. Otherwise you can argue many things can't possibly be made into options. Should location data always be sent, because satellites are complicated and hard to explain? Should we let them choose whether they can turn wifi on or off, because you have to explain IEEE 802.11 to them?
> I don't want anything sent from MY device without my consent
Then don’t run someone else’s software on your device. It’s not your software, you are merely a licensee. Don’t delude yourself that you are morally entitled to absolute control over it.
The only way to have absolute control over software is with an RMS-style obsession with Free software.
17 replies →
Notice is always good and Apple should implement notice.
However, "my data is being sent off my device" is incorrect, as GP explained. Metadata, derived from your data, with noise added to make it irreversible, is being sent off your device. It's the equivalent of sending an MD5 of your password somewhere; you may still object, but it is not factually correct to say your password was transmitted.
> It's the equivalent of sending an MD5 of your password somewhere; you may still object, but it is not factually correct to say your password was transmitted.
Hackers love to have MD5 checksums of passwords. They make it way easier to find the passwords in a brute force attack.
https://en.wikipedia.org/wiki/Rainbow_table
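Concretely, a dictionary attack against unsalted MD5 is a few lines (a sketch using Apple's CryptoKit; the "leaked" hash and guess list are invented); rainbow tables just precompute this same work in bulk:

    import CryptoKit
    import Foundation

    // Hash a candidate password and render it as lowercase hex.
    func md5Hex(_ s: String) -> String {
        Insecure.MD5.hash(data: Data(s.utf8))
            .map { String(format: "%02x", $0) }
            .joined()
    }

    let leakedHash = md5Hex("hunter2")  // pretend this came from a breach
    let guesses = ["password", "123456", "hunter2", "letmein"]
    if let hit = guesses.first(where: { md5Hex($0) == leakedHash }) {
        print("recovered password: \(hit)")
    }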
>> It's the equivalent of […]
> Hackers love to have MD5 checksums of passwords.
Hackers love not understanding analogies. :)
1 reply →
Nobody is responding seriously to this because you seem to have missed the part where GP said "with noise added to make it irreversible", and the third sentence of that Wikipedia article.
Hackers don’t know about salts yet?
1 reply →
> However, "my data is being sent off my device" is incorrect, as GP explained. Metadata, derived from your data, with noise added to make it irreversible, is being sent off your device.
Sounds like my data is being sent off my device.
> It's the equivalent of sending an MD5 of your password somewhere
Sounds even worse lol
It does not sound like that at all.
There is plenty of data on your device that isn’t “your data” simply due to existing on your device.
[flagged]
If the information being sent from my device cannot be derived from anything other than my own data, then it is my data. I don't care what pretty dress you put on it.
> It's the equivalent of sending an MD5 of your password somewhere
a) MD5 is reversible in practice; it just costs GPU time to brute-force it
b) It is unproven that their implementation is irreversible
BFV has a formal security proof (its hardness reduces to the Ring-LWE assumption), and Apple open sourced their Swift library implementing it, so it's not totally unproven.
https://github.com/apple/swift-homomorphic-encryption
1 reply →
Well that's what you're told is happening. As it's all proprietary closed source software that you can't inspect or look at or verify in any manner, you have absolutely zero evidence whether that's what's actually happening or not.
If you can't inspect it, that just means you don't know how to use Ghidra/Hopper. ObjC is incredibly easy to decompile, and Swift isn't much harder.
"Your data" is not actually being sent off your device, actually, it is being scrambled into completely unusable form for anyone except you.
This is a much greater level of security than what you would expect from a bank, for example, which needs to fully decrypt the data you send it. When using your banking apps over HTTPS (TLS), you are trusting the CA infrastructure, you are trusting all sorts of things. You have fewer points of failure when a key for homomorphic encryption resides only on your device.
"Opting-in by default" is therefore not unsafe.
I guess it depends on what you're calling "your data" -- without being able to reconstruct an image from a noised vector, can we say that that vector in any way represents "your data"? The way the process works, Apple makes their own data that leaves your device, but the photo never does.
It's the same as the CSAM initiative. It doesn't matter what they say they send, you cannot trust them to send what they say they send or trust them not to change it in the future.
Anything that leaves my devices should do so with my opt-IN permission.
Even if they implemented the feature with opt-in permissions, why would you trust this company to honor your negative response to the opt-in?
How would you explain client side vectorization, differential privacy and homomorphic encryption to a layman in a single privacy popup so that they can make an informed choice?
Or is it better to just trust that mathematics works and thus encryption is a viable way to preserve privacy and skip the dialog?
The big mistake here is that ownership of your Apple devices is an illusion...
Do you consider your data to include non-reversible hashes of your data injected with random noise? I'm not sure I consider that my data. It's also not even really metadata about my data.
Do you use iCloud to store your photos?
I’m not the person you asked, but I agree with them. To answer your question: No, I do not use iCloud to store my photos. Even if I did, consent to store data is not the same as consent to scan or run checks on it. For a company whose messaging is all about user consent and privacy, that matters.
This would be easily solvable: on first run, show a window with:
> Hey, we have this new cool feature that does X and is totally private because of Y [link to Learn More]
> Do you want to turn it on? You can change your mind later in Settings
> [Yes] [No]
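That's a one-screen job; here's a minimal SwiftUI sketch (the wording and the settings key are hypothetical, just to show the shape):

    import SwiftUI

    struct FirstRunPrompt: View {
        // Persisted choice, changeable later in Settings.
        @AppStorage("enhancedSearchEnabled") private var enabled = false
        @State private var showPrompt = true

        var body: some View {
            Text("Photos")  // placeholder for the real UI
                .alert("New: enhanced photo search", isPresented: $showPrompt) {
                    Button("Yes") { enabled = true }
                    Button("No", role: .cancel) { enabled = false }
                } message: {
                    Text("It's private because only encrypted, noised data ever leaves your device. You can change your mind later in Settings.")
                }
        }
    }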
When iCloud syncs between devices, how do you think that happens without storing some type of metadata?
You don’t use iCloud for anything? When you change phones, do you start fresh or use your computer for backups? Do you sync bookmarks? Browsing history?
Do you use iMessage?
36 replies →
I hate this type of lukewarm take.
"Ah, I see you care about privacy, but you own a phone! How hypocritical of you!"
You’re describing Matt Bors’ Mister Gotcha.
https://thenib.com/mister-gotcha/
If you care about your “privacy” and about no external service provider having access to your data, that means you can’t use iCloud at all, or any messaging service, or any backup service; you’d have to use Plex and your own hosted media, not use a search engine, etc.
9 replies →
When your phone sends out a ping to search for cellular towers, real estate brokers collect all that information to track everywhere you go and which stores you visit.
Owning a phone is a privacy failure by default in the United States.
> When your phone sends out a ping to search for cellular towers, real estate brokers collect all that
Care to provide a pointer to what device they are using? I would absolutely get my real estate license for this.
You are being downvoted because you're so painfully correct. It's not an issue exclusive to the United States, but American intelligence leads the field far and away on both legal and extralegal surveillance. The compliance forced by US Government agencies certainly helps make data tracking inescapable for the average American.
Unfortunately, the knee-jerk reaction of many defense industry pundits (and VCs, for that matter) is that US intelligence is an unparalleled moral good, and the virtues of privacy aren't worth hamstringing our government's work. Many of these people will try to suppress comments like yours because it embarrasses Americans and American business by association. And I sympathize completely - I'm dumbfounded by the response from my government now that we know China is hacking our telecom records.
FWIW, SS7 has had known flaws for a very long time.
It's apparent it has been kept in place because of all the value it provides to the Five Eyes.