Comment by joshstrange
7 days ago
This is almost certainly the answer and clever as hell. You just have to make sure the server storing the firmware (which you control) has the right CORS headers (as you mention) and you are in business.
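For anyone curious what "the right CORS headers" might look like in practice, here is a minimal sketch using only Python's standard library. The origin `https://update.example.com`, port 8000, and the `FirmwareHandler` / `cors_headers` names are all made up for illustration. I don't know what the actual vendor's server runs, so treat this as one possible setup, not how they do it.

```python
from http.server import HTTPServer, SimpleHTTPRequestHandler

# Hypothetical origin of the update web page (an assumption, not from the thread).
ALLOWED_ORIGIN = "https://update.example.com"


def cors_headers(origin: str) -> dict[str, str]:
    """Response headers that let a page served from `origin` fetch our files."""
    return {
        "Access-Control-Allow-Origin": origin,
        "Access-Control-Allow-Methods": "GET, OPTIONS",
    }


class FirmwareHandler(SimpleHTTPRequestHandler):
    """Serves files from the current directory and adds CORS headers."""

    def end_headers(self):
        # Attach the CORS headers to every response before the header block closes.
        for name, value in cors_headers(ALLOWED_ORIGIN).items():
            self.send_header(name, value)
        super().end_headers()

    def do_OPTIONS(self):
        # Answer preflight requests with an empty 204 response.
        self.send_response(204)
        self.end_headers()


if __name__ == "__main__":
    # Port 8000 is an arbitrary choice for this sketch.
    HTTPServer(("0.0.0.0", 8000), FirmwareHandler).serve_forever()
```

Without the `Access-Control-Allow-Origin` header, the browser would refuse to hand the downloaded firmware bytes to the page's JavaScript, so the page could not relay them to the device.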
This means that the CarPlay device has no "internet" (spoiler: it never had real internet access) unless you are on that page interacting with it.
I'm not sure how these devices work. I know they broadcast themselves as a CarPlay head unit and then "somehow" pass that to the car via a wired connection (pretending to be a phone connecting via USB), with "somehow" being the important part. Does it hand along an encrypted stream that it can't decode or does it decode/re-encode?
Either way, I'd bet these devices are pretty safe to use. The phone sends a video feed, not raw "data", so the MitM (again, if that's how it works) would need to OCR the video to get anything useful, since the raw video would be too large to store and too heavy to transfer over cellular (via its own hidden radio, again, worst-case scenario).
If the device decodes the stream in the middle, then the worst case I can think of is that it does on-device OCR and uses a cellular radio to exfiltrate the text, but I feel confident that you (or someone doing a teardown) could spot the cellular radio. Without the radio it has no way to get data off the device, which means the best it could do is sneak some out while you were on that update screen. Though I think that's all pretty far-fetched.
EDIT: I went looking for some way to act as a CarPlay receiver and get the raw video feed, and it looks like it's possible [0]. So yeah, a malicious device could proxy the connection, OCR the result, and send data via its own cellular connection, but that would be relatively easy to detect and not worth it unless you are the target of a nation state, at which point you have bigger problems.
> Does it hand along an encrypted stream that it can't decode or does it decode/re-encode?
It definitely does decode/re-encode audio streams, as music playback suffers quite a bit (both latency and quality).
If you want to capture what's going on, you don't need 120fps video. Take a low-res snapshot every 5-10 minutes and send it off. It doesn't need OCR or anything fancy. That's still a ton of information, with very little bandwidth.
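A rough back-of-the-envelope sketch of that comparison, to put numbers on it. Every figure here is an assumption picked for illustration (1280x720 at 30fps with 3 bytes per pixel for uncompressed frames, a 15 KB low-res JPEG per snapshot); real devices and feeds will differ.

```python
def raw_video_bytes_per_second(width: int, height: int, fps: int,
                               bytes_per_pixel: int = 3) -> int:
    """Data rate of uncompressed video frames, in bytes per second."""
    return width * height * bytes_per_pixel * fps


def snapshot_bytes_per_hour(snapshot_bytes: int, interval_minutes: float) -> float:
    """Data sent per hour when one snapshot is uploaded every `interval_minutes`."""
    return snapshot_bytes * (60 / interval_minutes)


if __name__ == "__main__":
    # Assumed feed: 1280x720, 30 fps, 24-bit color (hypothetical values).
    raw_rate = raw_video_bytes_per_second(1280, 720, 30)
    # Assumed snapshot: 15 KB JPEG every 5 minutes (hypothetical values).
    snap_rate = snapshot_bytes_per_hour(15_000, 5)

    print(f"uncompressed video: {raw_rate / 1e6:.1f} MB per second")
    print(f"periodic snapshots: {snap_rate / 1e3:.1f} KB per hour")
```

Under those assumptions the snapshot approach needs several orders of magnitude less bandwidth than shipping the full feed, which is the point: the "too heavy to transfer" argument only applies to the full video.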