Comment by LeifCarrotson
9 months ago
You too can be a data broker!
    for (i = 0; i < 900000000; i++)
        insert(first: random_firstname(), last: random_lastname(), ssn: i);
Does anyone really care if the name is accurate as long as the SSN is present? More than half of the SSNs in the above dataset are valid.
You probably are posting this as a joke, but without a clear technical solution to this problem, flooding the industry with bullshit data seems like a great avenue.
I have a silly standup joke along these lines, about how I'd Google crazy things like "circus lawyer" or "giraffe mitigation tactics" to throw the algorithm off every now and then.
My friend is a thriller writer and is convinced he’s on some FBI list. He’s googling stuff such as “how to dissolve a body with quicklime” and all sorts of other fun stuff while researching for his books.
That was the idea behind certain applications and add-ons that would browse popular websites and randomly click ads so that marketers couldn't tell your actual interests from fake ones.
Unfortunately that strategy is deeply flawed and dangerous because nobody cares if the data they have on you is accurate or not. They still can, and still will, use it against you at every opportunity. Every scrap of data they have, accurate or not, can be used to hurt you.
The only way to flood data brokers with garbage data that can't hurt anyone is to fill it with entirely fictitious people who somehow can't be mistaken for any actual people. Even that runs the risk of hurting real people though. For example, an insurance company might go to a data broker and ask for the number of people within a certain neighborhood or zip code who bought fast food more than once a week in the last year and how many have a gym membership. If the number of frequent fast food buyers is higher than it was last year and/or the number of gym members is lower, the insurance company might decide to raise the rates of every single member within that neighborhood or zip code. Even fake people could skew those numbers if their fake data said they lived in those zip codes or neighborhoods and ate out a lot or didn't have a gym membership. Indirectly, the fake person ends up being mistaken for a real one in that community.
The best way to deal with data brokers is to regulate them with strong data protection laws. Anything you give them risks hurting someone and gives them another data point to sell.
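To make the insurance example above concrete, here is a minimal arithmetic sketch in Python. All numbers are made up for illustration (a zip code with 1,000 real records, 300 of them frequent fast food buyers, plus 200 injected fake records that all look like frequent buyers), and `fast_food_share` is a hypothetical helper, not anything a real broker or insurer exposes.

```python
def fast_food_share(real_total, real_frequent, fake_total=0, fake_frequent=0):
    """Share of records in one zip code flagged as frequent fast food buyers.

    All inputs are illustrative counts; the aggregate cannot tell real
    records from fake ones, which is the whole problem.
    """
    total = real_total + fake_total
    frequent = real_frequent + fake_frequent
    return frequent / total


# Hypothetical zip code: 1,000 real people, 300 frequent buyers.
before = fast_food_share(1000, 300)

# Same zip code after 200 fake "frequent buyer" records are injected.
after = fast_food_share(1000, 300, fake_total=200, fake_frequent=200)

print(f"{before:.1%} -> {after:.1%}")  # 30.0% -> 41.7%
```

Nothing about the real residents changed, but any pricing rule keyed to the aggregate share would see a jump of more than ten points.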
> might decide to raise the rates of every single member within that neighborhood or zip code
Wouldn't that be against redlining laws? https://en.wikipedia.org/wiki/Redlining
Isn't something like regulation with strong data protection laws a bit late at this point? It seems fair to say that most people alive have already been scooped up in one large data breach or another.
And that data has likely been made public in some form, and is probably replicated to dark corners of the planet.
Don't get me wrong, regulation on these industries seems like a no-brainer, but it seems unlikely to remediate the damage already done.
> Every scrap of data they have, accurate or not, can be used to hurt you.
What are some examples of inaccurate data, as in completely false data, being able to hurt me?
That has been my strategy for the last decade or so. Unless I have a solid reason to, I never use my real name when placing orders, and generally never the same fake name twice. I always use a virtual credit card, and if it's a non-physical product I don't even use my real address. I have some old phones I throw pre-paid SIM cards into when I need to do number confirmation. The goal is to create as little consistent, linkable data about me as possible and at least generate some noise in all these data broker collection processes.
I do the same, but I worry that eventually someone's going to need to see my driver's license and refuse me because my ancient account info doesn't match.
"It says here that this shipment is for Firstname Lastname at 1 Main St, Yourcity, born January 1st in the same year as you. Your license has a different address and different birth day and month, so you're not the same person."
In fact there are far fewer valid Socials. They follow a system where a number of the digits can be guessed fairly reliably from year and state of birth.
This is not exactly true; the system _used_ to have a geographic component but SSNs issued since 2011 are random.
(Granted, most people here with an SSN should be older than that.)
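For a rough sense of how many of the 900 million numbers in the loop at the top are even structurally possible, here is a Python sketch. It assumes the commonly cited format rules (area is never 000, 666, or 900-999; group is never 00; serial is never 0000); the function names are made up for this example. It checks structure only and says nothing about whether a number was ever issued to anyone, which is the stricter sense of "valid" being argued about above.

```python
def is_structurally_valid(ssn: int) -> bool:
    """Check a 9-digit number against the assumed AAA-GG-SSSS format rules."""
    if not 0 <= ssn <= 999_999_999:
        return False
    area, rest = divmod(ssn, 1_000_000)   # first three digits
    group, serial = divmod(rest, 10_000)  # middle two, last four
    if area == 0 or area == 666 or area >= 900:
        return False
    return group != 0 and serial != 0


def count_structurally_valid() -> int:
    """Count every number passing the rules without looping over all of them."""
    areas = sum(1 for a in range(1000) if a not in (0, 666) and a < 900)
    groups = 99      # 01..99
    serials = 9999   # 0001..9999
    return areas * groups * serials


print(count_structurally_valid())  # 888931098
```

So nearly all of the loop's output passes the format check; the "about half" figure only makes sense if "valid" means "actually assigned to a person".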