Inside the "3 billion people" national public data breach

9 months ago (troyhunt.com)

> While the specifics of the data breach remain unclear, the trove of data was put up for sale on the dark web for $3.5 million in April, the complaint reads.

I guess they failed to sell it because links to the leaked data on usdod.io have been available on Breachforum/Leakbase for over a week now. Someone created a magnet link yesterday and it's fully seeded so speeds are fast.

The data in the breach is irreversibly public now.

  • > Someone created a magnet link yesterday

    Are you against simply sharing the infohash here? I'd like to download the leak to see what information it has on myself and my family, but I don't really relish the idea of signing up for a breachforums account and sifting through its posts if I can avoid it.

    • Here is a strongly encrypted base64 version to keep hackers out:

      bWFnbmV0Oj94dD11cm46YnRpaDozY2FhNzFmM2VjOGNiY2NjNmZjYTRmZWI3MTg1ZGEyYmFiMTQ5YmE3JmRuPU5QRCZ0cj11ZHA6Ly90cmFja2VyLm9wZW5iaXR0b3JyZW50LmNvbTo4MCZ0cj11ZHA6Ly90cmFja2VyLm9wZW50cmFja3Iub3JnOjEzMzcvYW5ub3VuY2U=

      Allegedly, the password (also base64 encrypted) is:

      aHR0cHM6Ly91c2RvZC5pby8=

    • BitTorrent uses something called a "distributed hash table", for which there exist services to search it (btdig, etc). You can use one of those alongside the torrent name (NPD) to find it.

      I haven't downloaded it, but my understanding is that the data comes compressed and with a (weak) password.

    • FYI, that is likely to be a crime; at the very least, there have been cases of websites being punished for linking to illegally distributed IP (even if not hosting it).

  • Nobody's gonna pay that much money for it when you can get it from ad companies for pennies

  • Now everyone just needs to send their email addresses to HIBP, i.e., email HIBP, so he can connect these identities with IP addresses and working email accounts. For peoples' protection of course.

    After everyone "has been pwned" then there is no need for HIBP. The answer is always "yes". Yet I am certain sites like "HIBP" will never go away. Something about email marketing.

    Some HN commenter(s) will inevitably try to defend HIBP. But this comment also refers to sites "like HIBP" that use data breach dumps opportunistically to generate web traffic, collect IP and email addresses. Some folks just do not see what is wrong with the idea.

It's worth remembering that the main reason this kind of data breach is a real problem is mostly due to the incompetence of the IRS. For any serious financial organization, knowing a person's SSN, name, address, etc doesn't allow you to access or withdraw that person's finances.

But the stupidity of the IRS means that people are easily targeted by false tax return attacks. File a fake tax return for someone, using their SSN/name/address, but tell the IRS you changed address. Then the IRS sends your tax refund to the new address, and boom, you just collected some poor sod's refund. To add insult to injury, the IRS is probably going to audit the person whose refund you stole.

  • But not just the IRS; the banking system, most healthcare providers, the states for most of a century, and the credit bureaus are also to blame for reusing the SSN as a unique identifier and "password".

  • I agree. The IRS should be better funded so they can afford to update their systems and hire more tech experts.

    • I hope this is meant to be satirical. The IRS has a massive budget. Maybe reallocating their current funds instead of giving them more is a better idea.

  • This comment is shockingly misguided.

    The IRS doesn't have the authority to mandate the creation of a secure national ID system and enforce its use by the financial system. Only Congress has the ability to really do that. The IRS collects revenue.

    Even if it did have that authority, it doesn't have the budget to accomplish that goal.

    • isn't it funny how no government service is ever at fault, it's always just a problem of funding? The IRS is good, just underfunded. Public schools are good, just underfunded. The NHS is good, just underfunded. The roads are good, just underfunded

      except then funding is raised, and it's still a problem of funding. and inevitably, it's the evil side of the government (you know the one) that is to blame, even if there is no money to spend.

      how does a public service determine when they have enough funding?

  • What you describe might be out of date. Someone tried to use my identity to file a fake tax return. The IRS caught it and now I get issued a PIN every tax season for kinda-sorta two factor auth.

Troy mentions "data opt-out services. Every person who used some sort of data opt-out service was not present."

Anyone have experience with these sort of services? A search brings up a lot of scammy-looking results. But if services exist to reduce my profile I'd be interested.

  • > Anyone have experience with these sort of services?

    Quite a bit. Often if you request removal or opt-out, you'll reappear in a matter of a few months in their system, regardless of whether you use a professional service as a proxy or do it yourself. The data brokers usually go out of their way to be annoying about it and will claim they can't do anything about you showing up in their aggregated sources later on. They'll never tell you what these sources are. A lot of them will share data with each other, stuff that's not public. It's entirely hostile and should be illegal. I am trying to craft a lawsuit angle at the moment but they feel totally unassailable.

    I'm extremely skeptical of any service that claims it can guarantee 100% removal over any period longer than 6 months. From my technical viewpoint and experience, it is very much an unsolved problem.

    • my understanding is that there's a bit of a catch-22 with data removal - if you request that a data broker remove ALL of your information, it's impossible for them to keep you from reappearing in their sources later on because that would require them to retain your information (so they can filter you out if you appear again).
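
The catch-22 described above is sometimes softened with a suppression list: rather than keeping the full record, a broker could keep only a keyed hash of a normalized identifier and drop any incoming record that matches it. The following is a minimal Python sketch under stated assumptions. `SuppressionList`, `fingerprint`, and the choice of name plus date of birth as the matching key are all hypothetical and do not describe how any real broker works. The broker still retains something derived from your data, and because names and birth dates are easy to guess, the hash is only as private as the secret key.

```python
import hashlib
import hmac


def fingerprint(secret_key: bytes, full_name: str, birth_date: str) -> str:
    """Keyed hash of a normalized identity; the raw fields are not stored."""
    # Normalize case and whitespace so trivially different spellings match.
    normalized = " ".join(full_name.lower().split()) + "|" + birth_date.strip()
    return hmac.new(secret_key, normalized.encode("utf-8"), hashlib.sha256).hexdigest()


class SuppressionList:
    """Hypothetical opt-out list that stores fingerprints instead of records."""

    def __init__(self, secret_key: bytes) -> None:
        self._secret_key = secret_key
        self._suppressed = set()

    def opt_out(self, full_name: str, birth_date: str) -> None:
        self._suppressed.add(fingerprint(self._secret_key, full_name, birth_date))

    def is_suppressed(self, full_name: str, birth_date: str) -> bool:
        return fingerprint(self._secret_key, full_name, birth_date) in self._suppressed

    def filter_records(self, records):
        """Drop incoming records (dicts with 'name' and 'dob') that match an opt-out."""
        return [r for r in records if not self.is_suppressed(r["name"], r["dob"])]
```

Matching here is exact after normalization, so a record with a missing middle initial or a different birth date format would slip through, which is consistent with people reappearing a few months after opting out.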

    • I've had a very bad experience with Liberty Mutual following a data opt-out from another service. They sent me on a runaround, ending with an email saying to follow "this link" to verify myself. (There was no link, only sketch.) I ended up getting a human on a phone through special means, and they sent me a fixed email with a working link.

      I should be hearing back from them in the next 32 days, as this was 13 days ago.

    • It's hard to make collection, aggregation, and sharing of facts illegal.

      Not to minimize the harm that can be done by such collections, but the law is justifiably looking for a scalpel treatment here to address the specific problem without putting the quest to understand reality on the wrong side of the line.

    • This is true and nothing new. Mass "gray market" personal information services have leapt into markets since Visa and Mastercard fifty years ago, and somewhat before that with driving records, in the USA. The "pure land" of democracy in North America was never pure, and the Bad Old Ways have crept into the corners since the beginning.

  • Consumer Reports just published (as in last week) a report[1] surveying a number of these services and found almost all of them to be a little bit effective, none of them to be highly effective, and the cheapest of the lot to be the most effective (EasyOptOuts).

    Of note, opting out of a service by yourself by hand was only 70% effective ($0). Using EasyOptOuts was around 65% effective ($20) and using Confidently was only 6% effective ($120).

    [1] https://innovation.consumerreports.org/wp-content/uploads/20...

  • A lot of the data opt-out services are operated by or have the same owners as data brokers. So at the very least they are selling both the poison and the cure.

  • If you're willing to tempt fate, the best way to 'opt-out' is to tell people, when they call asking to speak to 'your name', that 'your name' sadly passed away recently.

    • I knew someone falsely declared dead (probably a paperwork mix-up around pensions when his ex-spouse died). Without warning, he lost all of his pensions, social security, medicare, etc, along with most financial institutions freezing accounts and canceling credit cards. Many long phone calls, letters, and lawyers eventually resolved most of it, but that never fully purged the public and private death records, so there were random issues for the rest of his life (failing fraud checks, brief interruptions to pensions, trouble with the cable company).

    • I prefer to just never answer a phone call unless I know who is calling and it's someone I know personally and want to speak to. Even then, those people know I'd rather they text anyway so when they do call it's more likely to be really important.

  • I have used (free trials) and currently use (discounted annual) a service called incogni. It's hard to really verify what's going on, but they at least show the brokers they are contacting on your behalf, and I've directly received confirmations from some.

    Anecdotally, searching my name on Google pretty much no longer returns those scummy "People Finder" pages that just scrape any public records they can find.

    That said, I hope incogni is happy enough with my money that they themselves don't do anything scummy.

    Also, freeze your credit at the big three. Do it now.

  • In the past I have just searched for my own name. And when I found a match, I would go to that site and request to be removed. It is a lot of work, but thus far it has been successful.

    And I say this because I was on a TV show years ago, so my real name is all over the internet from an entertainment point of view. But if you search my real name, few to no results point back to "public record" websites and the like.

  • Many seem scammy, and I went through the search before and gave up.

    Then, as fate would have it, an HNer (tjames7000) mentioned he made EasyOptOuts for this reason, so I signed up. Cheap, seems effective, absolutely no complaints.

  • Since it is Troy I assume it is legit, and I haven't read the link yet. But... How does he know that?

    Have the opt-out services leaked as well? Or is no one using them? How would we know?

It is crazy to me that data brokers are even a legal form of business. All of these services should be opt in at minimum. If they are obtaining publicly available information and making it easier to access, they should have to maintain insurance or a deposit with the government to compensate victims of cybersecurity incidents. Telling people to get credit monitoring is in NO WAY an acceptable way to make us whole. They need to pay for a lifetime of monitoring and INSURANCE up to the net worth of affected individuals. This needs to become law ASAP.

"there were no email addresses in the social security number files. If you find yourself in this data breach via HIBP, there's no evidence your SSN was leaked, and if you're in the same boat as me, the data next to your record may not even be correct. "

Seems like Troy is skeptical about this being a real full breach?

  • A lot of these data brokers hold wildly inaccurate information.

    • You too can be a data broker!

          for (i = 0; i < 900000000; i++)
              insert(first: random_firstname(), last: random_lastname(), ssn: i);
      

      Does anyone really really care if the name is accurate if the SSN is present? More than half of the SSNs in the above dataset are valid.

    • Yes, but they can also be pretty accurate.

      While I have never dealt with one of the paid services someone ran one on me as an example of what is out there (nothing malicious about it) and just about everything on it was accurate or close to it. Only one thing on it wasn't at least pretty close to the truth--it had me living in a state I've never set foot in. And quite a few other people seemed to have the same address at one point or another.

  • I'm in the UK so I have no Social Security Number, and I still got the HIBP e-mail.

    When I looked into it, it turns out the "original" breach consists of files named ssn.txt and ssn2.txt, which only contain Americans' details and don't contain any e-mail addresses.

    It seems what happened is there was one leak of US SSNs which the leakers attributed to NPD, then some people bundled that leak up with a bunch of other data (including e-mail addresses and details of non-Americans) and who knows if the latter data actually came from NPD?

  • >the data next to your record may not even be correct. "

    American Express by way of Experian alerted me to my SSN having been leaked precisely by this incident.

    The number was seemingly correct, but everything else associated with it such as name and address were nonsense.

    So assuming we're talking about the same thing... can confirm?

  • I don't think it's a "full" breach because I assume that would include many tera/petabytes of original source documents rather than just a CSV of PII, but it's definitely a real breach.

    I looked up several family members and although most of the phone numbers and addresses were out of date, they were accurate as were the listed social security numbers. However, it didn't include any of the more recent immigrants in the family or myself, possibly because I take opsec seriously.

    Funny enough it looks like it has data for Tom Brady, former FBI director James Comey, Barack Obama, and Donald Trump (just some of the names that popped into my mind to look up).

For years I've said the entire SSN database just needs to be published alongside legislation strictly assigning liability to any company that is defrauded as a result of using the SSN as a "secret". That would fix the problem with SSNs and "identity theft" quickly.

Part 1 has been accomplished. Let's get part 2 going!

Aside: It amazes me how the American public has allowed defrauded companies to assign the company's loss as a liability to innocent individuals (in the form of "identity theft"). It would be great if we could get that changed in the minds of the public. A well-informed public could collectively turn "identity theft" into the "bank's problem" (from the old adage "If you owe the bank a billion dollars they have a problem..."). The insurance industry would swoop in as the defrauded parties start making claims and shoddy security practices would get tightened-up.

(Edit: I fear insurance companies coming in to "fix this" to some extent-- citing my experiences with PCI DSS compliance auditing and Customers who have had 'cyber insurance' policies coming with ridiculous security theatre requirements. Maybe we can end up with something like a 'cyber' Underwriters Labs in the end.)

(Also: Yikes! I hate that I just typed 'cyber' un-ironically.)

  • Identity theft is a very clever term to shift blame from the company to the consumer.

    https://youtu.be/CS9ptA3Ya9E

    It’s a comedy bit but I take its point seriously: if the bank gives away money, it’s the bank’s job to make sure it is repaid. Not mine, unless I was actually a party to the agreement.

    • Well then you're up against the wall of digital verification.

      I know there's a fuck load of situations where the banks are 100% screwing the customer to their benefit, but there's a legit conversation about people who give out their passwords, or claim they did, when money gets wiped out.

      If you meet all the requirements to identify yourself to the bank, at what point does the bank have to say "this is that person, and that transaction is legal".

      Now granted:

      1. With passkeys and biometrics and 2FA we've got a lot of better ways to make these accounts secure, and hopefully more idiot proof. I'm hoping we start getting rid of email/phone for 2FA as a valid option though.

      2. The moment the police are treating it as an identity theft case, the bank should be required to pony up. I don't know if that's the case (and wouldn't be surprised if they fight it tooth and nail), but at that point you have a state or federal entity acknowledging this is not a legit transaction, and therefore you should be compensated by the bank, and they can get their money back from the insurance companies that insure against this kind of thing.

  • It's not even necessary to publish the database. Pass a law, or even possibly a regulation or court instruction, that SSN is not a sufficient basis to establish identity, and that any unauthorised financial transaction, legal document, commercial transaction, or other use relying on SSN is considered prima facie uninsurable fraud.

    Use would likely diminish markedly.

  • Ever since the Equifax breach I’ve been a proponent of a new national ID program to replace the SSN, that can be designed for what the SSN has become and tolerant to these never ending data breaches.

    Maybe this will give a second chance at a conversation around that, but I’m not too hopeful.

  • US law does generally make fraud the bank's problem. Identity theft isn't a loophole in this; it is a situation in which there is a logical ambiguity in differentiating one fraud from another. If they just believed everyone who said "it wasn't me that spent that money!" that would just be opening another vulnerability.

    • I think we've got liability pretty well buttoned-up in the banking industry. I'm more concerned about the non-bank businesses. (I recently obtained utilities at a new house. All three utilities-- electrical, gas, and water/sewer-- use my SSN as an authenticator for my account. In 2024.)

For non-Americans (and Americans) that don't quite understand what SSN is and why it's a problem, CGP Grey [1] has a great (and short) video about the history and why it's not technically an identifier, but has become one.

[1] https://www.youtube.com/watch?v=Erp8IAUouus

  • It's so interesting how Australia went the other way and actually banned the use of any government-issued ID number as a primary identifier by any organisation other than the government department which issued that ID number.

    In the 80s, the very popular Aussie prime minister, Bob Hawke wanted to introduce a National ID card, complete with a unique number, that would then be used for everything from Medicare to tax filing. The government however did not have the numbers to pass it through the Senate. Hawke called a double dissolution (dissolving both lower and upper houses of parliament) over the issue. He was returned to power after the election but still without a majority to get the bill through.

    There were then attempts to use "other" government issued ID cards like the Medicare number, for this purpose. To prevent this, a few years later, a bill was passed that would prevent any such use.

    In reality, this means businesses can ask for government issued numbers but it has to be optional and voluntary, and never used as a primary ID. When I go to my doctor for example, I can provide them with my medicare number, in which case they will claim the Medicare rebate on my behalf automatically, or I can refuse to provide them this number, pay the doctor's fee in full, and claim the rebate from medicare myself separately. Similarly I can provide my bank with my tax file number, in which case they will automatically tax my interest earned according to my income band. Or I can not provide them my tax file number, in which case they'll tax my interest at the highest income band, and I can then get the money back from the tax office when I file my tax returns at the end of the year.

    In Australia we don't have a Bill of Rights. We don't even have a right to freedom of speech. The police can ask us to unlock our phones without a warrant; etc etc. Yet when it comes to privacy, our laws are very clear. For a country with such a history of protecting individual liberties, it always amazes me that the United States takes such a laissez faire approach to privacy.

  • The video doesn't quite get into the problem of identity theft, which is when someone uses your stolen creds to claim they are you, and then go on a shopping spree which may include buying a car under your name. You shouldn't be liable for debts incurred after having your identity stolen but proving that is a lot of work.

    • > You shouldn't be liable for debts incurred after having your identity stolen but proving that is a lot of work.

      The first step is to call it what it is: fraud by misrepresentation. The owner wasn't deprived of access to their identity (a key component of theft); they weren't even involved in the transaction. Companies want to have their cake and eat it: have low barriers to making sales/offering loans without rigorously verifying the identity of the person benefiting, and be shielded from losses when their low-friction on-boarding fails and lets in fraudsters.

      If a home buyer is duped into transferring a deposit into a fraudster's account, they don't blame it on corporate "identity theft" and put the escrow agent on the hook by default.

    • "Identity theft" is just fraud, rephrased to make us the victims instead of the defrauded companies.

      That's why SSNs are still such a big deal. Why fix the problem when you can just make it someone else's problem?

    • In many other places SSNs are non-sensitive data. There is not much one can do just knowing a SSN. Usually one has to do some kind of verification (eg using some sort of authentication app, if online). Which is why it is so confusing.

> The problem with verifying breaches sourced from data aggregators is that nobody willingly - knowingly - provides their data to them

This is a bit of a tangent but I feel like if we can prove this statement then these data aggregators should be made illegal. How can you consent to something that you don’t know you’re consenting to? Likewise why do these entities have the right to collect detailed personal information like SSN without your explicit, beyond reasonable doubt, consent? To me this is the most obvious failure of the legal system, it clearly goes against well established legal principles that a basic requirement of an agreement is that all parties know what they are agreeing to.

Obviously there is some leeway with agreements where it’s not possible to clarify every eventuality, but let’s say if you’re applying to rent a place through an online form and that form shares your SSN with a data aggregator, it should be extremely clear about that, and it should be possible to opt out while still allowing you to complete the rental application without discrimination.

It’s like, it should be possible to show that no one, within reason, consented to sharing their data with this aggregator because no one is able to confirm that they did. Sure, one person could forget, or lie, but 100s of millions of people? No. Clearly almost zero people knowingly consent.

  • I have been using a different site@mydomain email address for every service I've used for the past 15 years. I can point to exactly which site breach furnished my email address to the aggregators.

    • Care to call out some bad actors so others know to avoid business with them?

      I recently started using unique emails for everything I sign up for. Thankfully I haven’t seen anything yet, but I have little hope it will stay that way.

    • I like email forwarding services, like ddg, mozilla’s relay, iCloud’s hide my email and simple login. Unique password and email address for every website, plus, like you said if your unique email shows up somewhere it’s a smoking gun.
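
The per-site address scheme described in this thread boils down to a lookup table: hand each site its own alias, and when an alias turns up in spam or a breach dump, look up which site it was given to. A minimal Python sketch, assuming you control a catch-all domain (`example.com` is a placeholder) and keep the mapping yourself; the `AliasBook` class and its method names are hypothetical, and forwarding services such as the ones named above manage this for you.

```python
import re
from typing import Dict, Optional


class AliasBook:
    """Hypothetical registry of one e-mail alias per site on a catch-all domain."""

    def __init__(self, domain: str) -> None:
        self.domain = domain
        self._site_by_alias: Dict[str, str] = {}

    def register(self, site: str) -> str:
        """Create (or re-create) the alias for a site and remember the mapping."""
        # Collapse anything that is not a letter or digit into a single hyphen.
        local = re.sub(r"[^a-z0-9]+", "-", site.lower()).strip("-")
        if not local:
            raise ValueError("site name has no usable characters")
        alias = f"{local}@{self.domain}"
        self._site_by_alias[alias] = site
        return alias

    def source_of(self, leaked_address: str) -> Optional[str]:
        """Return the site an address was given to, or None if it is unknown."""
        return self._site_by_alias.get(leaked_address.strip().lower())
```

A predictable alias can be guessed by a spammer who knows the pattern, so a match is strong evidence rather than proof; appending a random suffix per site would tighten that at the cost of needing the table to reconstruct addresses.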

I was wondering why Google suddenly turned on "prompt authentication" on zero-security feature accounts yesterday. Now I "must" have a phone nearby to use Gmail... Tap to authenticate every time you want to look at ... ad spam.

With this, Ticketmaster, and the CDK Global car theft, is there anybody on Earth who doesn't need data protection? Poor people in Somalia need data breach notices. People who are not even on the WWW need data breach notices...

Anything the average SSN holder should be doing proactively?

  • You could freeze your credit, if you wanted to be careful. Realistically though, you should have already been monitoring to check if unexpected things were being done in your name. I’ve presumed that all our SSNs have been out there for years now due to one hack or another; that this hack just makes it indisputable doesn’t change much.

Why are data aggregators legal? In California can we create a proposition to shut them down in the state?

This sort of stuff will continue happening until the regulatory framework acknowledges a fundamental consumer right to privacy.

If a data broker collects data without the consent of the consumer, then their only real risk is a class action lawsuit which drags on for six years, gets settled for a few days' profit, and the consumer gets $13.50 after the legal fees. This massive skew in the risk reward calculus of data brokers is why we have the problem. Because there's little to no real downside, the trend is to automatically collect as much data on as many people as possible.

Fixing this means big, mandatory, cash penalties in the law code - say $5k per consumer data leak, directly to the affected consumer, with added penalties if the company lies about the leak or delays payment. The fine must be big, mandatory, and paid directly to the consumer. Only that changes the risk reward ratio.

In that new world, companies would have to reassess their risks. They'd either build invulnerable systems and hire a lot more people reading HN to protect their golden goose, or better still they'd decide to exit the business entirely. That sounds bad, but the only reason the industry exists is because regulators failed to foresee massive leaks like this happening every three months.

We need a consumer data privacy law, with massive fines, to force companies to change their behavior. What we're doing now clearly does not work.

  • They should tax companies so that operating data centers become more expensive. Increase price of electricity or property tax. That will inherently force companies to collect and store less data, hence less damage from breaches.

I used Robokiller to remove myself from data broker lists. I'm extremely impressed with it. I pay yearly. My only annoyances with Robokiller are that:

A) It's necessary. When is the government going to start creating laws to help us and prosecute this?

B) It's expensive. Most people cannot afford this. I can barely afford it but my information has been leaked online.

C) It's inconvenient. A majority of calls are spam, but I'll often miss important calls from unknown numbers because Robokiller acts as a proxy and for some reason the call is routed through the Internet.

Anyhow, my wife and I are not on this list. I'm wondering if using Robokiller saved us from a lot of pain here.

Even before this, anyone operating a service who isn't treating SSNs as public knowledge in 2024 needs to be, well, shamed or penalized or something.

I’ve finally figured out the play: war of attrition.

Eventually enough data will be leaked to make moot the benefits of securing any personal data. At that point everyone stops trying and moves on to more financially rewarding activities.

I mean even if I’m an elephant, and data breaches are blind men, eventually enough blind men will draw a true comprehensive picture.

Several other commenters have brought up the sneaky wordplay involved in saying "identity theft" instead of simply calling it "fraud on the bank", and somehow turning the person into the victim rather than the bank that has been defrauded.

Has anyone tried to argue this point in court? Has this survived / how did this terminology shift survive judicial scrutiny?

From the NPD website:

> Please be advised that we will not collect, use, disclose, sell, or share the sensitive personal information or sensitive data of California, Virginia, Colorado, or Connecticut residents as those terms are defined by the CCPA/CPRA, VCDPA, CPA, or CTDPA, respectively.

Does anyone else just not give a fuck at this point about their SSN? I feel like maybe early 00s this would be scary but it's clear that everyone's SSN is out there already or waiting to get breached from a shady private data broker.

The problem lies in how institutions treat the SSN, not the number itself.

  • Yes. 99% of the time “identity theft” means a huge company cut corners on their security policies and wants us to subsidize their negligence. Every so often there are cases like that guy who pretended to be his former coworker for decades but they’re rare enough that they make the news internationally. Most of the time it used to be things like instant credit applications where they didn’t “slow” purchases with ID checks.

    The good news is that companies have lost the presumption of competence there. In the 80s if a company said they’d confirmed that an applicant was you using your SSN, a lot of people would falsely believe that was sufficient but by now they’re not going to get far if they sue you unless they can provide better evidence because everyone knows huge breaches have happened many times.

    • Not good news. Doesn't matter if the business is presumed competent. What matters is that the business can steal your assets to pay for their losses.

  • If you know the place of birth and the place of SSN application, you can determine most of the SSN. The final 4 are supposed to be random, but they are blurted out to rooms full of people and tech during service.

    The integrity of SSN security was lost a long time ago.
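
For context on the comment above: an SSN is written AAA-GG-SSSS (area, group, serial), and until the SSA switched to randomized assignment in June 2011 the area number was tied to the state where the number was applied for, which is what makes older numbers partly guessable. The sketch below, in Python, splits a number into those three parts and applies the publicly documented structural rules (area is never 000, 666, or 900 to 999; group is never 00; serial is never 0000). The function names are hypothetical, and passing this check says nothing about whether a number was ever issued or to whom.

```python
import re
from typing import NamedTuple, Optional


class SSNParts(NamedTuple):
    area: int    # first three digits
    group: int   # middle two digits
    serial: int  # last four digits


_SSN_RE = re.compile(r"^(\d{3})-?(\d{2})-?(\d{4})$")


def parse_ssn(text: str) -> Optional[SSNParts]:
    """Split 'AAA-GG-SSSS' (hyphens optional) into parts, or return None."""
    match = _SSN_RE.match(text.strip())
    if match is None:
        return None
    return SSNParts(int(match[1]), int(match[2]), int(match[3]))


def is_structurally_valid(parts: SSNParts) -> bool:
    """Apply the structural rules only; this does not check issuance."""
    if parts.area == 0 or parts.area == 666 or parts.area >= 900:
        return False
    if parts.group == 0:
        return False
    if parts.serial == 0:
        return False
    return True
```

Because the structural rules exclude so little of the nine-digit space, the number by itself carries almost no secrecy, which is the point the thread keeps returning to.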

Are there any ways to check the breach to see if my information is there, other than downloading it myself? I’m not sure of the legality of doing so.

Time for services everywhere to stop using SSNs for identification and for the US to move on to a more advanced form of identification.

And lock your credit.

  • What can an attacker who knows your SSN still do with that information nowadays? Genuinely curious, as the SSN is just this strange, indistinct password thingy that Europeans like me hear about on HN but have no actual parallel for.

    • If they have your address, birthday, and SSN, a whole lot. Generally, they could apply for credit cards or loans, set something up to bill to you, etc.

      Fortunately, it's getting harder without previous addresses or other verification methods.

      For non-Americans that don't know, our Social Security number is generally assigned at birth or when you become a citizen by the Social Security Administration. Social Security is a disability and old-age benefit we all pay into (roughly 7.5% employee and 7.5% employer - ~15% total). It's the only number we all get, since not everyone gets a driver's license, ID, passport, or other identifier. Unfortunately, it's been used to identify us for everything, and until recently was typically in plaintext on most forms (medical, tax, student, etc.).

      CGP Grey has a good summary of how it came about and why it's become a problem: https://www.youtube.com/watch?v=Erp8IAUouus

    • The SSN is used as a way to genuinely identify someone, unfortunately - it’s like having to give out your password each time you rent an apartment or buy a car or obtain medical care or any number of other transactions. Having this info (along with other basic info like name/address/date of birth) lets you effectively pretend you are them. You can take loans out in their name or call some service to do a password reset (since you have all the info to verify you are them) or whatever else. But it’s not like there is one particular way in which the information can be used - it’s dependent on what businesses LET you do with that info. In 2024, NO business should use SSN to verify identity or authorize sensitive transactions but many do, and what they let you do varies significantly.

Can't the SSA just issue 330 million new social security numbers, and tell people to be more careful with them from this point forward?

  • The SSA specifically told people not to misuse SSNs this way and it seems like a poor use of taxpayer funding to spend billions bailing out businesses’ bad decisions, even if that was legal (Congress would have to specifically authorize it), since we’d be back to the same problem within five years.

    If we were going to do something, we’d make government ID include an NFC token for PKI purposes since public keys can’t be compromised in the same way, but nobody is jumping to pay for that, especially in a country where you have so many people prone to wild conspiracy theories (I am especially amazed by the guys who freak about a national ID as big brother but never say a word about the credit reporting industry) and the enduring “Mark of The Beast” religious fears.

    • > If we were going to do something, we’d make government ID include an NFC token for PKI purposes

      Another alternative would be to go the other way: Pass a law prohibiting the use of social security numbers for any purpose other than social security. Don't provide any globally unique identifier for companies to use.

      Instead each institution would issue their own identifier which would have no value outside of that institution. If they get breached or you lose your ID, they mail a new one to the address they have on file or some similar recovery method and you don't have to worry about someone using your ID somewhere else because the breached one gets disabled and you get a replacement.

      The obvious advantage here is that companies can't use it to correlate your activity across institutions without your knowledge or consent.

    • > If we were going to do something, we’d make government ID include an NFC token for PKI purposes since public keys can’t be compromised in the same way, but nobody is jumping to pay for that, especially in a country where you have so many people prone to wild conspiracy theories (I am especially amazed by the guys who freak about a national ID as big brother but never say a word about the credit reporting industry) and the enduring “Mark of The Beast” religious fears.

      Login.gov [1] gets us pretty far until NFC can get baked into credentials. Would love to see passport cards evolve into this [2], but again, it would take lots of work and political will to make that happen. In the meantime, remote and in-person proofing to bind IRL gov credentials to digital identity will have to do.

      (As of December 31, 2023, over 111 million people have signed up to use Login.gov to date, with over 324 million sign-ins in 2023; this is ~1/3rd US population; no affiliation)

      [1] https://login.gov/

      [2] https://travel.state.gov/content/travel/en/passports/need-pa...

    • Painting those of us concerned with privacy as "people prone to wild conspiracy theories" is a very bad faith take.

      Please do not give the government any more power over me than they already have, thanks.

  • The SSA has shown absolutely no urgency on this issue. Their existing policy is that having your SSN compromised is not enough to issue a new number. You have to actually be a victim of a financial or identity crime that abused your SSN for them to consider a new number. In reality what they should be doing is giving everyone accounts that can generate tokens for use with each transaction, to maintain a trail of where leaks originate and also to expire these temporary tokens. Instead they’ve stuck to this archaic system.

    • They can't issue new numbers in bulk without revamping the system because they'd run out. The urgent fix wouldn't work.

      If the system needs to be revamped, then step one should be pressure/force so that companies stop treating the numbers as secret. And if we do that we don't need new numbers anymore.

What if we just made all this data free? Some AI is going to compile it anyway (and probably already has). Deterrence is the best defense, right?

  • It depends on the country. Where I live now even if I leak my name, date of birth, bank details, national id number, etc. you couldn't do much. We have a country wide 2FA system that all important businesses use (bank, utilities, health, government) to authenticate users.

    I'm from the UK though, and previously was a 'victim' of identity theft, where a few years ago someone walked into a phone store and walked out with a new iPhone and contract in my name.

    • Is the country wide 2FA implemented by the country or a private company? While rare, what if a person does not have access to the 2FA mechanism, and what mechanisms are permitted to confirm an identity?

“The database DOES NOT contain information from individuals who use data opt-out services. Every person who used some sort of data opt-out service was not present.”

Like what?

And where is this information that this random group supposedly has? I have yet to see proof of that being real

  • It's real. A few people I know are in the dataset. The SSN is problematic, but personally to me, the more troubling data is a seemingly complete, or at least complete enough, address history for the people I checked for. It doesn't have dates, but just having the addresses could cause major problems for spear phishing attempts.

  • I was able to get my hands on it, and I was able to confirm that some records of loved ones are indeed present (although mine was not).

the government should have put out honey pots or something, or maybe it’s time to get new numbers and just invalidate all the stolen data, there is clearly money for fixing this kind of thing but they’re using it to spy on us and do who knows what else instead

I worked incident response for years, logging thousands of hours of actual on site work with impacted clients.

No one cares.

Clients see this as the cost of doing business and have no incentive to do better. Even after Equifax and OPM.

Until we have a GDPR style law in the U.S. it will continue to be status quo.

I sure wish the US had a version of GDPR.

I get a data breach notice at least a few times a year. I got one for my kids two months ago for their medical data. I thought HIPAA had huge penalties but I guess not.

Perhaps HN readers would appreciate a detailed account of what the NPD torrents contain.

The torrent delivers two files like so:

  NPD202401.7z  33,456,912,010 bytes (32GB)
  NPD202402.7z  20,548,499,322 bytes (20GB)

Uncompressing NPD202401.7z results in:

  ssn.txt 176,806,109,779 bytes (165GB)
  wc -l ssn.txt ==>> 1,698,302,005 lines

Uncompressing NPD202402.7z results in:

  ssn2.txt 120,722,361,611 bytes (113GB)
  wc -l ssn2.txt ==>> 997,379,508 lines

This is a total of 1698302005+997379508 = 2,695,681,513 lines.

Each line is a comma separated record with these fields:

ID,firstname,lastname,middlename,name_suff,dob,address,city,county_name,st,zip,phone1,aka1fullname,aka2fullname,aka3fullname,StartDat,alt1DOB,alt2DOB,alt3DOB,ssn

Generally records have ID, firstname, lastname, middlename, address, city, county_name, st, zip, and ssn. Most records do not have the fields for name_suff (name suffix), phone1, aka1fullname, aka2fullname, aka3fullname, StartDat, alt1DOB, alt2DOB, and alt3DOB.
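
As a rough illustration of the layout above, here is a minimal Python sketch that maps one line onto the listed field names. It assumes the files are plain comma-separated text that the standard `csv` module can read (the quoting rules of the real files are not confirmed here), and the sample line is entirely made up, including the shape of the SSN value.

```python
import csv
import io

# Field names exactly as listed in the post above.
FIELDS = [
    "ID", "firstname", "lastname", "middlename", "name_suff", "dob",
    "address", "city", "county_name", "st", "zip", "phone1",
    "aka1fullname", "aka2fullname", "aka3fullname", "StartDat",
    "alt1DOB", "alt2DOB", "alt3DOB", "ssn",
]


def parse_record(line: str) -> dict:
    """Map one comma-separated line onto the 20 field names."""
    values = next(csv.reader(io.StringIO(line)))
    if len(values) != len(FIELDS):
        raise ValueError(f"expected {len(FIELDS)} fields, got {len(values)}")
    return dict(zip(FIELDS, values))


# Synthetic placeholder line (not real data). The eight rarely-filled fields
# between zip and ssn are left empty, as the post says is typical.
SAMPLE = ",".join(
    ["1", "JANE", "DOE", "Q", "", "19800704", "1 EXAMPLE ST",
     "ANYTOWN", "EXAMPLE", "ZZ", "00000"]
    + [""] * 8
    + ["000000000"]
)

record = parse_record(SAMPLE)
# Names of the fields that actually carry a value in this record.
populated = [name for name, value in record.items() if value]
```

Listing the populated fields per record is one simple way to see which columns are usually empty.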

There are no emails at all. There is no "@" in the files anywhere. Phone numbers are very rare.
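
A check like "no '@' anywhere" does not need the file loaded into memory. Below is a minimal sketch of a single streaming pass that counts newlines (what `wc -l` reports) and looks for one byte. The function name, chunk size, and demo file are assumptions for illustration, not what was actually run.

```python
import os
import tempfile


def scan_file(path: str, needle: bytes = b"@", chunk_size: int = 1 << 20):
    """Return (newline_count, needle_found) from one streaming pass."""
    if len(needle) != 1:
        # A longer needle could straddle a chunk boundary; not handled here.
        raise ValueError("this sketch only supports a single-byte needle")
    lines = 0
    found = False
    with open(path, "rb") as handle:
        while chunk := handle.read(chunk_size):
            lines += chunk.count(b"\n")
            if not found and needle in chunk:
                found = True
    return lines, found


# Demo on a tiny synthetic file standing in for ssn.txt.
with tempfile.NamedTemporaryFile("wb", suffix=".txt", delete=False) as tmp:
    tmp.write(b"1,JANE,DOE\n2,JOHN,ROE\n")
    demo_path = tmp.name

demo_lines, demo_has_at = scan_file(demo_path)
os.remove(demo_path)
```

Reading in fixed-size binary chunks keeps memory use flat regardless of file size, which matters for files over 100GB.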

I don't know what the ID number at the head of each line represents. I presume it is an internal index used by the organization that compiled the data. The SSN is at the end of each line.

The files have U.S. addresses only as far as I can tell. Nothing from Mexico, Canada, or other foreign countries.

Many of the lines (records) concern the same person at various addresses. Of 7 random people whom I personally know and checked on, all had entries. There were between 3 and 20 lines (records) for these 7 persons, averaging about 10. They usually differed only in the address field. Going by an estimate of 10 records per person, the roughly 2.7 billion lines represent about 2695681513/10 = 269,568,151 distinct persons in the U.S.

The U.S. population is about 337M where 78% is over 18 years of age. In other words, 337000000*0.78 = 262,860,000 Americans are adults. This is pretty close to my estimate of 269,568,151 distinct individuals in the NPD data files.
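
The arithmetic above can be reproduced in a few lines. This sketch uses only the numbers quoted in the post; the 3-to-20 range is the per-person spread seen in the 7-person sample, included to show how sensitive the estimate is to the assumed average.

```python
# Line counts reported by `wc -l` in the post.
lines_file_1 = 1_698_302_005
lines_file_2 = 997_379_508
total_lines = lines_file_1 + lines_file_2  # 2,695,681,513

# Rough average of records per person from the 7-person sample.
records_per_person = 10
estimated_persons = total_lines // records_per_person

# Adult population figure used in the post: 337M people, 78% over 18.
us_adults = round(337_000_000 * 0.78)
ratio = estimated_persons / us_adults

# Sensitivity check: the sample ranged from 3 to 20 records per person.
low_estimate = total_lines // 20
high_estimate = total_lines // 3

print(f"estimated persons: {estimated_persons:,}")
print(f"US adults:         {us_adults:,}")
print(f"ratio:             {ratio:.3f}")
print(f"range:             {low_estimate:,} to {high_estimate:,}")
```

The close match with the adult population depends heavily on the assumed average of 10; a sample of 7 people leaves a wide range.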

Of the 7 persons I checked on, the names were spelled correctly, although the middle name was sometimes just an initial. I searched each person by multiple methods (address, last name, birth date) so I believe I would have detected names that were spelled slightly wrong.

The addresses appeared correct but there was no way to tell which was the current address and the order in which they lived at each address. There is a StartDat field but it was almost never filled in. The latest entry was not always the most current address. In a couple cases, the current address, where the person has been living for several years, was absent.

The birth dates were correct in a couple of cases, were abbreviated in three cases (that is, instead of showing 19800704, meaning July 4 1980, it showed 19800700, meaning July 1980 without an exact day), and were wrong for one person by a wide margin.
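
A minimal sketch of reading that date format, assuming every dob value is an 8-digit YYYYMMDD string in which 00 marks an unknown month or day. The function name and return shape are choices made for this illustration.

```python
from datetime import date
from typing import Optional, Tuple


def parse_dob(raw: str) -> Tuple[int, Optional[int], Optional[int]]:
    """Split a YYYYMMDD string into (year, month, day); 00 becomes None."""
    if len(raw) != 8 or not raw.isdigit():
        raise ValueError(f"not an 8-digit date: {raw!r}")
    year, month, day = int(raw[:4]), int(raw[4:6]), int(raw[6:])
    if month and not 1 <= month <= 12:
        raise ValueError(f"month out of range: {raw!r}")
    if month and day:
        # date() raises ValueError for impossible dates such as Feb 30.
        date(year, month, day)
    return year, month or None, day or None


full = parse_dob("19800704")     # exact date
partial = parse_dob("19800700")  # month known, day unknown
```

Returning `None` for the missing day keeps a partial date from being mistaken for the first of the month.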

All 7 persons I checked had SSN numbers. It was correct for 1 person but I don't know for the other 6. The SSN numbers were consistent for each of the 7 persons I checked on. By this I mean that a person did not have more than 1 SSN number, at least among the 7 persons I checked on.
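
The "one SSN per person" check can be sketched as a grouping step. The matching key below (first name, last name, and the year-month part of dob) is a hypothetical choice for illustration; the post does not say how records were matched, and all records shown are synthetic.

```python
from collections import defaultdict


def ssns_per_person(records):
    """Group records by a rough person key and collect distinct SSNs."""
    seen = defaultdict(set)
    for rec in records:
        # Use only YYYYMM of dob, since the day is sometimes recorded as 00.
        key = (rec["firstname"], rec["lastname"], rec["dob"][:6])
        seen[key].add(rec["ssn"])
    return dict(seen)


# Synthetic placeholder records (not real data).
records = [
    {"firstname": "JANE", "lastname": "DOE", "dob": "19800704", "ssn": "000000000"},
    {"firstname": "JANE", "lastname": "DOE", "dob": "19800700", "ssn": "000000000"},
    {"firstname": "JOHN", "lastname": "ROE", "dob": "19750100", "ssn": "000000001"},
    {"firstname": "JOHN", "lastname": "ROE", "dob": "19750115", "ssn": "000000002"},
]

grouped = ssns_per_person(records)
# Keys with more than one distinct SSN are the inconsistent cases.
conflicts = {key: ssns for key, ssns in grouped.items() if len(ssns) > 1}
```

A name-plus-birth-month key will merge distinct people who share both, so a conflict here is a prompt to look closer, not proof of an error.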

Ahh, cool, pour the corpus through GPTs and start tweeting Congressional rep personal info at them until they pass a law to outlaw data brokers (in keeping with historical precedent [1] [2]).

[1] https://en.wikipedia.org/wiki/Video_Privacy_Protection_Act

[2] https://jolt.law.harvard.edu/digest/dodging-the-thought-poli...

  • For argument sake, instead of outlawing data brokers wouldn’t it be better to design a better ID system that renders one’s name, dob, and SSN as harmless information?

    I don’t know what that would look like but if I had Congress’s attention I’d like them to fix the problem rather than playing whack-a-mole with banning data sources. I don’t think any actual solutions come from that.

    • In many countries in Europe, your ID card contains a chip with a cryptographic key, much like chip&pin on a debit or credit card.

      Those bits of information are worthless when you need to create a cryptographic signature with your ID card to do almost anything important.

      If the card is lost or stolen they can just remove your old one from the keyserver. It's literally just public key crypto.

      Identity theft is rampant in the countries that don't have such a system and basically require you give them increasing amounts of private information to prove who you are. In the UK that's every address you've lived in for 5 years, your council tax bill, your energy bill, your bank statement for a month... all because British people think an ID card means you'll get stopped on the street to show your papers.

    • Funny you should say that. Australia is trying to launch TEx, designed on open-source models to do this kind of thing. It's hitting the usual roadblocks of public acceptance of government-mandated ID, in an economy which trashed the "Australia Card" idea back in the 80s. We're wiser now, we've been frogs boiled slowly: the downsides of central safe ID/auth are outweighed by the risks of loss of info giving everyone 100 points information.

      The government now knows what we do most of the time anyway: layer-2 logs on our phones are constant. We lost any privacy some time ago. So now, getting security back might be a net win.

      https://www.abc.net.au/news/2024-08-13/trust-exchange-digita...

    • We should be doing both, for different reasons. Ban data brokers because they allow anyone with a credit card to stalk people, more or less legally. Fix the SSN identity system because even if you ban data broker businesses, dark web brokers don't abide by the laws anyways.

    • I’d replace “instead of” with “in addition to”.

      Going after data brokers seems like low-hanging fruit, and necessary even if the ID system needs to be replaced. This is a top-level issue that needs to be addressed regardless.

      While I think it’d be great to design a system where the information you mention is harmless (I’m curious how this would work without just shifting the problem to whatever new identifier is established), the reality is that this information is not harmless, and will continue to be dangerous to leak for the foreseeable future due to the myriad of systems that use this data in its current form. Any theoretical project to replace this would likely be a long and drawn out undertaking. Addressing the information environment in the meantime seems like a good idea.

    • > I’d like them to fix the problem rather than playing whack-a-mole with banning data sources

      We should fix the problem and ban the data-sources. Whack-a-mole makes it sound like we're talking about a ban on one company, but what clearly needs to be done is a categorical ban on super sketchy business practices, and that seems simple enough. Data-brokers, if they are going to exist at all, need to accept the burden of proof to establish that every single row involves consent, and they need to acquire new consent for every single resale of the information. If that makes the whole industry unprofitable, too fucking bad. And if this looks bad for business, it gets even worse: good luck getting consent for reselling what is mine without offering me a cut.

      Since the above kind of common sense looks crazy these days, let's throw in something even more radical. For anyone looking to fund UBI, ^ here's a start. The trouble with the often-mentioned idea of "tax the data" as a solution for privacy concerns is that these taxes are just redistributing wealth from corporations to governments, while all of profit is made with our information. Who wants the monetized details of their personal life to pay for the next unjust war, or even the roads in some place they don't live. If we are so valuable, put some of that money back in our hands, and if the price doesn't sound fair to us, then let us opt out of the sale.

    • The uneven availability of information means that no, it's not better to just design a better ID system. Data brokers give corporations far more advantages than a normal person could ever protect themselves against, because even if the data broker doesn't have your government-issued credentials they can still easily determine who you are by collating data from other sources such as purchasing habits, cellular records, and service guest lists.

    • It's politically a non-starter in the US. US states have a lot of power that is derived from their ability to maintain their own ID systems. The states have fought for almost 20 years on requirements as simple as REAL ID.

    • Plenty of countries have smart cards with chips and RSA keys that can be used to verify ID with much higher level of certainty, but then they usually don't use it.

      Even just name, DOB, and last 4 of the SSN and you are done.

      It's ridiculous.

    • https://news.ycombinator.com/item?id=40961834

      TLDR Login.gov, and publishing a circular to allow businesses to use it to identity proof. Push all liability onto the business for losses if this method is not used to identity proof. ID card as ljm mentions, such as a passport card. Very similar to credit card EMV chips and the liability shift from magstripe.

      > I don’t know what that would look like but if I had Congress’s attention I’d like them to fix the problem rather than playing whack-a-mole with banning data sources. I don’t think any actual solutions come from that.

      Aggregating data means it can be lost. You must therefore make aggregating and storing data toxic, and impossible to be leaked through eventual mismanagement.

  • We detached this subthread from https://news.ycombinator.com/item?id=41249125.

    • I thought it was a legitimate proposal to the problem at hand, but respect and understand the decision. My apologies for taking the conversation potentially off topic.

      https://paulgraham.com/founders.html

      > Though the most successful founders are usually good people, they tend to have a piratical gleam in their eye. They're not Goody Two-Shoes type good. Morally, they care about getting the big questions right, but not about observing proprieties. That's why I'd use the word naughty rather than evil. They delight in breaking rules, but not rules that matter. This quality may be redundant though; it may be implied by imagination.

      While scoped to founders, I think it broadly applies to a subset of curious people who are wired to solve problems, imho.

  • Err, why do you need a GPT for this stunt? For a quarter of the price of a 2010s mid-range HP laptop, I have a Python script for you.

I am just dreading the day when a near-simultaneous cyberattack on a large number of people (the more vulnerable ones, like middle- and lower-income individuals) starts in a DDoS fashion:

1. Unlocked credit histories will be used to file multiple credit applications, and tax credits will be applied for.

2. Multiple cell phones will be hijacked through SIM hijacking or other zero-day attacks to make it very difficult to get back in.

3. A person's profile will be used to attack the most vulnerable things: - Their families will get fake calls to create confusion. - Their financial services will be frozen or, worse, the ones with weak 2FA will be compromised.

4. Deepfake images and videos will be created from compromised accounts to sow further mayhem.

This already happens in a targeted fashion, with one strategy or another. Imagine what one could do with a bit more compute and complete profiles to orchestrate this kind of terrible vengeance.

  • I am wondering what the numbers are like for this to be realistic.

    I am not too sure of the end goal other than general chaos. Let’s say it’s 2 days of an attack, (that’s about how long any co-ordinated response would need at minimum).

    So attackers need to sow chaos across the USA. They apply for a million unsecured loans of say 20k each. That’s 20 billion.

    I honestly don’t know what the daily personal loan application rate is, but America has about 150M adults, and 1% of them applying on the same day would not only raise flags but would basically grind the system to a halt: each loan office would have daily maximums, and a massive spike could not be handled. And once the massive crowd is noticed and made public, the financial immune system comes into play.

    I can imagine taking out the cell network through a sort of SS7 ddos, but I suspect that cell towers might have a dose more vulnerabilities (probably not as basic as all the admin passwords are ComC4astSux but close)

    In general, chaos seems to come from attacking the limited services that act as our safety net (ambulance, police, sewage, electricity). We know these are vulnerable in non-obvious ways: CrowdStrike, for example.

    Making otherwise fit and healthy citizens have a shitty day is less impactful than we might think - it will be the “blip” day - as I say 48 hours later the Treasury secretary goes on TV and announces all personal loans that day got cancelled or some other fix - finance has a fairly good immune system when it sees the need.

    But overall, if we are going to worry about some attacks, let’s look at the ones that attack our freshwater supplies - and that might not mean some terrorist - in the UK our sewage handling has been under attack by Private Equity for decades and SWAT teams are not allowed to shoot people in Belgravia

    • You’d need to pick a day of importance to launch the chaos-sowing attack against information and social services. I’m sure there’s a useful one in early November.

  • In the US, the government could help a lot if they simply moved to a national ID system and dismantled social security numbers.

    The national ID systems I've seen proposed have a lot more security from the ground up, and could replace the passport system.

    • The US has done itself a disservice with their actions because few people trust the government. A national ID system means a database of all Americans that would very likely be used for surveillance and monitoring. I'm saying this as someone who has Global Entry so it's not like I'm afraid of being in a US database but I see the concerns.

    • The US doesn't need a national ID. It needs a national PKI.

      The US Postal Service is in a great position to be the one who executes it. They have the access to deliver physical goods to the entire country. They have the staff and procedures to do identity verification for their current products that could be extended to a PKI offering.

      It'll never fly, politically.

    • "Wow, the government is so catastrophically bad at managing IDs; what should we do?"

      "Hmmm. I know! Let's get the government to manage a mandatory ID system, and require it for all aspects of citizens' lives! In fact, let's centralize all of their medical, financial and personal data using this ID, and ensure that it can all be accessed using this ID! What could possibly go wrong?"

  • I wonder how many governments have this capability right now? I would guess at least three.

    • As far as I know, most developed and developing countries have this kind of database. I also know some poor countries do too, but they often lack security measures.

  • SSNs can be used to disconnect utility service, too. Doing some amount of that would surely add to the "fog of war". It often takes phone calls but the tools have been created to automate that on a massive scale.

  • Luckily, there aren't multiple hostile nation states capable of this. /s

    All that I can see preventing it is deniability and eco-political risk.