Comment by CWuestefeld

14 years ago

His concern is not that OnStar will fail to remove your name from the GPS location stream. It is that even without a name attached, the subject's identity can be readily inferred from the data itself.

If one looks at a stream of location data over time, and sees the recurrence of a particular location in a residential area, particularly at night, then it can be pretty well surmised that this is the subject's home. And from that, it's a trivial step to get the subject's identity. And bingo, the anonymized data is now re-identified.
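To make the inference concrete, here is a minimal sketch of it, not anything OnStar or an attacker is known to run. The function name `infer_home`, the 22:00 to 06:00 night window, and the rounding to three decimal places (a cell roughly 100 m across) are all assumptions chosen for illustration.

```python
from collections import Counter
from datetime import datetime


def infer_home(pings, night_start=22, night_end=6, precision=3):
    """Guess a home location from an ID-free location stream.

    pings: iterable of (datetime, lat, lon) tuples.
    Returns the grid cell seen most often at night, or None.
    The night window and grid precision are illustrative assumptions.
    """
    cells = Counter()
    for ts, lat, lon in pings:
        # Keep only pings recorded during the assumed night window.
        if ts.hour >= night_start or ts.hour < night_end:
            cell = (round(lat, precision), round(lon, precision))
            cells[cell] += 1
    if not cells:
        return None
    # The most frequently recurring nighttime cell is the best guess.
    return cells.most_common(1)[0][0]
```

Once the cell is known, matching it against public address or property records is the "trivial step" to a name.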

There's a simple solution to that: don't give a stream of location data. Chop it up into 5-second fragments, and fuzz the data by a meter or so to prevent re-assembly.
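A rough sketch of that proposal, under my own assumptions: the name `fragment_and_fuzz` is made up, one random offset is applied per fragment, absolute timestamps are replaced with offsets inside the fragment, and the fragments are shuffled. The comment above specifies none of those details, only the 5-second windows and the roughly one-meter fuzz.

```python
import math
import random

METERS_PER_DEG_LAT = 111_320.0  # approximate length of one degree of latitude


def fragment_and_fuzz(trace, window_s=5, fuzz_m=1.0, rng=None):
    """Split one trace into short, unlinked, slightly shifted fragments.

    trace: list of (t_seconds, lat, lon) tuples.
    Returns a shuffled list of fragments; each fragment is a list of
    (seconds_since_fragment_start, lat, lon) tuples.
    """
    rng = rng or random.Random()
    buckets = {}
    for t, lat, lon in trace:
        # Group points into consecutive windows of window_s seconds.
        buckets.setdefault(int(t // window_s), []).append((t, lat, lon))

    fragments = []
    for _, points in sorted(buckets.items()):
        t0, lat0, _ = points[0]
        # One offset per fragment (an assumption), at most fuzz_m per axis.
        dlat = rng.uniform(-fuzz_m, fuzz_m) / METERS_PER_DEG_LAT
        dlon = rng.uniform(-fuzz_m, fuzz_m) / (
            METERS_PER_DEG_LAT * math.cos(math.radians(lat0))
        )
        fragments.append(
            [(t - t0, lat + dlat, lon + dlon) for t, lat, lon in points]
        )

    # Shuffle so list order does not reveal the original sequence.
    rng.shuffle(fragments)
    return fragments
```

Speed and heading within each fragment survive, which is what keeps the data useful for traffic analysis.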

That would still be a very valuable dataset (for me at least), and almost completely free of PII.

Then again, I'm not an expert in these things; am I missing some way that this could be deanonymized?

  • Adding a meter to the GPS location of where my car starts and stops at the end of each day still tells you where my house is.

  • Even if you removed any IDs from the data and sufficiently fuzzed the location, speed, and timestamps, you are still left with a heatmap of where cars with OnStar drive most frequently.

    In a city, that is probably anonymous. If you are in a rural area or drive along a route where your car makes up the majority of the data points, it still isn't.
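The rural-route point can be checked before release by whoever still holds the vehicle IDs. This is only a sketch: `dominated_cells`, the grid precision of two decimal places (about 1 km), and the 50% threshold are assumptions picked to mirror the "majority of the data points" wording above.

```python
from collections import Counter, defaultdict


def dominated_cells(points, precision=2, threshold=0.5):
    """Find heatmap cells where a single vehicle supplies most of the data.

    points: iterable of (vehicle_id, lat, lon) tuples, before IDs are stripped.
    Returns {cell: share}, where share is the largest single-vehicle
    fraction of points in that cell, for cells above the threshold.
    """
    per_cell = defaultdict(Counter)
    for vehicle_id, lat, lon in points:
        cell = (round(lat, precision), round(lon, precision))
        per_cell[cell][vehicle_id] += 1

    flagged = {}
    for cell, counts in per_cell.items():
        top_count = counts.most_common(1)[0][1]
        share = top_count / sum(counts.values())
        if share > threshold:
            # One car dominates this cell, so the heatmap exposes it.
            flagged[cell] = share
    return flagged
```

Flagged cells would have to be suppressed or merged with neighbors for the published heatmap to stay anonymous there.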