Comment by fn-mote
9 days ago
Too little, too late. AI scrapers are getting better and better at acting human. The scrapers already have a massive corpus; the marginal value of today's news is low and will remain so long after access is cut off. When they manage to block archive.is too, then I'll believe they're at least a little serious.
I think people forget one thing: LLMs don't even need to scrape. We copy-paste articles and documents right into their mouths; they only need to keep the mouth open. Copy-pasted content is also manually preselected, which might even filter out some garbage.
A subscriber opens the FT, reads an article about semiconductor export controls, and pastes it into Claude to ask "what does this mean for my portfolio?" The FT's content just entered a model's reasoning process, got synthesized with other knowledge, and produced derivative value. No scraper was involved. The paywall was respected. The subscriber paid. And yet the publisher's content was "consumed" by an AI in exactly the way they're trying to prevent.