Comment by GolfPopper

2 days ago

>pinkie-promise-style obligations don't affect players too small or shadowy to bother litigating

I think you're looking at the wrong end of the spectrum there. It's some of the biggest players who flout the rules.

"Several AI companies said to be ignoring robots dot txt exclusion, scraping content without permission: report" (2024) https://www.tomshardware.com/tech-industry/artificial-intell...

Fair point. Being small and shadowy is a sufficient condition to avoid litigation, but not a necessary one. Another sufficient condition is having billions of dollars to throw around. Unfortunately, archive.org is well known, well loved, and fundamentally harmless.

  • > fundamentally harmless.

    This is going to go in a boring direction, with an argument that's been made since Internet time immemorial, and before. The argument goes: pirating articles off nyt.com leads to lost subscription sales, so it's not harmless. The response is, inevitably, "No it doesn't, it leads to more sales." Or: people who weren't going to pay weren't going to pay anyway, so you might as well give it to them for free and be happy (as the NYT) for the free advertising. And then the follow-up: "No, it's a lost sale and journalism needs the money." HN is for thoughtful and substantive discussion, not for rehashing the same boring argument we've all read a thousand times. So my question isn't which camp is right. Both camps are firm in their beliefs: copyright infringement is fine, copyright infringement is not. My question is: in today's AI-fueled digital hellscape, how do we support journalists and the arts? If journalism only exists because, e.g., Jeff Bezos pays for the Washington Post, we're going to get biased reporting (which has existed since long before the Internet). And if art only exists because the artists come from rich families or have patrons, as in the Renaissance era, is society better off?

But AI companies don’t publicly redistribute the content they scrape, whereas Internet Archive does.

Even if you believe what the AI companies are doing is or should be a copyright violation, the Internet Archive is redistributing in a more direct manner.