Comment by CWuestefeld

14 years ago

His concern is not that OnStar will fail to remove your name from the GPS location stream. It is that even without a name attached, the subject's identity can be readily inferred from the data itself.

If one looks at a stream of location data over time and sees a particular location in a residential area recurring, particularly at night, it can be pretty well surmised that this is the subject's home. From there, it's a trivial step to the subject's identity. And bingo, the anonymized data is now re-identified.
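To make the inference above concrete, here is a rough sketch of how little it takes. Everything in it is an assumption for illustration: the `infer_home` name, the `(hour_of_day, lat, lon)` tuple layout, the roughly 100 m grid cell, and the 22:00 to 06:00 "night" window. It is not a description of what OnStar or anyone else actually does.

```python
from collections import Counter

def infer_home(points, cell_deg=0.001, night_start=22, night_end=6):
    """Guess a home location from an unnamed trace.

    points: iterable of (hour_of_day, lat, lon) tuples (hypothetical layout).
    cell_deg: grid cell size in degrees; 0.001 is roughly 100 m of latitude.
    Returns the center of the grid cell seen most often at night, or None.
    """
    counts = Counter()
    for hour, lat, lon in points:
        # Keep only points recorded during the assumed night window.
        if hour >= night_start or hour < night_end:
            # Snap to a coarse grid so nearby fixes land in the same cell.
            cell = (round(lat / cell_deg), round(lon / cell_deg))
            counts[cell] += 1
    if not counts:
        return None
    (cell_lat, cell_lon), _ = counts.most_common(1)[0]
    return (cell_lat * cell_deg, cell_lon * cell_deg)

# Made-up trace: three night fixes at one spot, daytime fixes elsewhere.
trace = [
    (23, 40.7128, -74.0060),
    (2, 40.7128, -74.0060),
    (3, 40.71281, -74.00601),
    (12, 40.7580, -73.9855),
    (14, 40.7580, -73.9855),
    (23, 40.7306, -73.9866),
]
home_guess = infer_home(trace)
```

No name or ID appears anywhere in the input; the recurring night-time cell is the identifier.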

There's a simple solution to that: don't give a stream of location data. Chop it up into 5-second fragments, and fuzz the data by a meter or so to prevent re-assembly.

That would still be a very valuable dataset (for me at least), and almost completely free of PII.

Then again, I'm not an expert in these things; am I missing some way that this could be deanonymized?

  • Adding a meter to the GPS location of where my car starts and stops at the end of each day still tells you where my house is.

  • Even if you removed any IDs from the data and sufficiently fuzzed the location, speed, and timestamps, you are still left with a heatmap of where cars with OnStar drive most frequently.

    In a city, that is probably anonymous. If you are in a rural area or drive along a route where your car makes up the majority of the data points, it still isn't.
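The point in the first reply, that a meter of fuzz does not hide where a car parks, can be checked with a small sketch. The helper names (`fuzz`, `distance_m`, `estimate_stop`), the uniform-disc noise model, and the flat-earth conversion of about 111,320 m per degree of latitude are all assumptions made for illustration, not anything from a real dataset.

```python
import math
import random

METERS_PER_DEG_LAT = 111_320.0  # rough approximation, assumed constant

def fuzz(lat, lon, radius_m, rng):
    """Shift a point by up to radius_m meters in a random direction."""
    r = radius_m * math.sqrt(rng.random())  # uniform over the disc
    theta = rng.uniform(0.0, 2.0 * math.pi)
    dlat = (r * math.cos(theta)) / METERS_PER_DEG_LAT
    dlon = (r * math.sin(theta)) / (
        METERS_PER_DEG_LAT * math.cos(math.radians(lat))
    )
    return lat + dlat, lon + dlon

def distance_m(a, b):
    """Flat-earth distance in meters; adequate for points a few meters apart."""
    dlat = (a[0] - b[0]) * METERS_PER_DEG_LAT
    dlon = (a[1] - b[1]) * METERS_PER_DEG_LAT * math.cos(math.radians(a[0]))
    return math.hypot(dlat, dlon)

def estimate_stop(fuzzed_points):
    """Average the fuzzed end-of-day positions to estimate the true one."""
    lats = [p[0] for p in fuzzed_points]
    lons = [p[1] for p in fuzzed_points]
    return sum(lats) / len(lats), sum(lons) / len(lons)

# Made-up example: the same parking spot, fuzzed independently on 30 nights.
rng = random.Random(0)
home = (40.7128, -74.0060)
nightly = [fuzz(home[0], home[1], 1.0, rng) for _ in range(30)]
error_m = distance_m(home, estimate_stop(nightly))
```

Every fuzzed point is still within a meter of the real spot, and the average of many nights can only be at least that close, so the fuzz radius would have to be on the order of the distance between houses before it hid anything.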