Comment by dlenski
5 hours ago
Agreed. I truly don't understand including these researchers on this list.
Some of Sweeney's most well-known work in this area is from the LATE 1990s. She was sounding the alarm about problems with anonymized data in medical datasets: https://en.wikipedia.org/wiki/Latanya_Sweeney#Medical_datase...
Her work almost certainly contributed greatly to awareness of these risks.
More recently she has apparently worked on things like protecting voting rights in the US by notifying voters if their registration records change.
I haven't followed what she's been working on recently.
But, yeah, at some point in the 90s, Massachusetts decided to release some "anonymized" health records for research purposes (I think just for state employees). One of the people in the data was Governor William Weld, who obviously had a lot of public information widely available. As I recall, Sweeney wrote the governor's office a bit later, basically saying "I have your medical records."
I used this as a slide or two in some AI presentations in the mid-2000s or so pre-LLMs when I had some peripheral involvement with some of the privacy-preserving research going on (differential privacy, multiparty computation, fully homomorphic encryption). Haven't really followed most of this for a while.
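For anyone who hasn't seen how the Weld-style re-identification works mechanically, here is a minimal sketch of a linkage attack. It assumes the "anonymized" release still carries ZIP code, birth date, and sex, and that a public list (such as a voter roll) has the same fields next to names; that field set is my recollection of the published account, so treat it as an assumption. All records, names, and the `reidentify` helper below are made up for illustration.

```python
from collections import defaultdict

# Hypothetical "anonymized" release: names stripped, quasi-identifiers kept.
health_records = [
    {"zip": "02138", "birth_date": "1950-01-15", "sex": "M", "diagnosis": "condition A"},
    {"zip": "02139", "birth_date": "1962-07-04", "sex": "F", "diagnosis": "condition B"},
    {"zip": "02139", "birth_date": "1971-03-22", "sex": "F", "diagnosis": "condition C"},
]

# Hypothetical public list with names attached to the same fields.
voter_roll = [
    {"name": "Voter One", "zip": "02138", "birth_date": "1950-01-15", "sex": "M"},
    {"name": "Voter Two", "zip": "02139", "birth_date": "1962-07-04", "sex": "F"},
    {"name": "Voter Three", "zip": "02139", "birth_date": "1962-07-04", "sex": "F"},
    {"name": "Voter Four", "zip": "02140", "birth_date": "1980-11-30", "sex": "M"},
]

# Assumed quasi-identifiers shared by both datasets.
QUASI_IDENTIFIERS = ("zip", "birth_date", "sex")


def key(record):
    """Project a record onto its quasi-identifier tuple."""
    return tuple(record[field] for field in QUASI_IDENTIFIERS)


def reidentify(anonymized, public):
    """Return {name: diagnosis} for rows whose key matches exactly one named person."""
    names_by_key = defaultdict(list)
    for person in public:
        names_by_key[key(person)].append(person["name"])

    matches = {}
    for row in anonymized:
        candidates = names_by_key.get(key(row), [])
        # Only a unique match is a confident re-identification.
        if len(candidates) == 1:
            matches[candidates[0]] = row["diagnosis"]
    return matches


if __name__ == "__main__":
    print(reidentify(health_records, voter_roll))
```

In this toy data only one record links to a single named person; the second record matches two voters, so it stays ambiguous, and the third matches nobody. The point is that removing names does nothing if the remaining fields are unique for enough people.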
See also: AOL's search data release
https://en.wikipedia.org/wiki/AOL_search_log_release
As I somewhat recall, there was also an issue with correlating IMDb with Netflix ratings at one point.