Comment by diggan
2 days ago
> There are already “infinite” websites like these on the Internet.
Cool. And how much of the software driving these websites is FOSS that I can download and run on my own website (one popular enough to be crawled more than daily by multiple scrapers)?
Off the top of my head: https://everyuuid.com/
https://github.com/nolenroyalty/every-uuid
How is that infinite if the last one is always the same? Am I misunderstanding this? I assumed it is almost like an infinite scroll or something.
Here's another site that does something similar (iterating over bitcoin private keys rather than uuids), but has separate pages and would theoretically catch a crawler:
https://allprivatekeys.com/all-bitcoin-private-keys-list
Aren't those finite lists? How is a scraper (normal or LLM) supposed to "get stuck" on those?
Even though 2^128 UUIDs is technically "finite", for all intents and purposes it is infinite to a scraper.
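To put a rough number on that, here is a back-of-the-envelope sketch. The crawl rate is purely an assumption (a hypothetical scraper fetching one million UUID pages per second); pick any rate you like and the conclusion barely moves.

```python
# Assumption: a hypothetical crawler that fetches 1,000,000 pages per second.
TOTAL_UUIDS = 2**128
PAGES_PER_SECOND = 1_000_000
SECONDS_PER_YEAR = 365.25 * 24 * 60 * 60

# Time needed to visit every UUID exactly once at that rate.
years = TOTAL_UUIDS / PAGES_PER_SECOND / SECONDS_PER_YEAR
print(f"{years:.2e} years to enumerate every UUID")
```

That comes out on the order of 10^25 years, so a scraper without some cutoff never reaches the end of the list.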
Every not-found page that doesn't return a 404 HTTP status code is basically an infinite trap.
It's useless to do this, though, as all crawlers have a way to handle it. It's very much crawler 101.
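For anyone wondering what that handling might look like, here is a minimal sketch of one common idea ("soft 404" detection), not how any particular crawler actually does it. The function names and the `fetch` callable are hypothetical; `fetch` stands in for whatever HTTP client the crawler uses and returns a `(status, body)` pair. The crawler requests a random path that almost certainly doesn't exist, remembers what came back, and treats later pages with the same body as missing.

```python
import hashlib
import uuid
from typing import Callable, Optional, Tuple

# Hypothetical fetch interface: url -> (HTTP status code, response body).
Fetch = Callable[[str], Tuple[int, str]]


def _digest(body: str) -> str:
    return hashlib.sha256(body.encode("utf-8")).hexdigest()


def soft_404_fingerprint(base_url: str, fetch: Fetch) -> Optional[str]:
    """Probe a random path; return a fingerprint of the host's fake 'not found' page."""
    probe_url = f"{base_url.rstrip('/')}/{uuid.uuid4().hex}"
    status, body = fetch(probe_url)
    if status == 404:
        return None  # the host reports missing pages honestly
    return _digest(body)


def is_missing(url: str, fetch: Fetch, fingerprint: Optional[str]) -> bool:
    """True for a real 404, or for a body identical to the probe response."""
    status, body = fetch(url)
    if status == 404:
        return True
    return fingerprint is not None and _digest(body) == fingerprint
```

An exact hash only catches byte-identical error pages; a real crawler would presumably need a fuzzier comparison, plus depth and per-host page limits for traps that generate unique content on every URL.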