Comment by diggan

1 year ago

> There are already “infinite” websites like these on the Internet.

Cool. And how much of the software driving these websites is FOSS that I can download and run on my own website (one popular enough to be crawled more than daily by multiple scrapers)?

8 comments

gruez 1 year ago

Off the top of my head: https://everyuuid.com/

https://github.com/nolenroyalty/every-uuid

johnisgood 1 year ago
How is that infinite if the last one is always the same? Am I misunderstanding this? I assumed it is almost like an infinite scroll or something.
- gruez 1 year ago
  
  Here's another site that does something similar (iterating over bitcoin private keys rather than uuids), but has separate pages and would theoretically catch a crawler:
  https://allprivatekeys.com/all-bitcoin-private-keys-list
  
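A minimal sketch of how a paginated enumeration like the ones linked above could work, assuming each page simply covers a fixed-size slice of the 128-bit integer range. The names `uuids_for_page` and `last_page` and the page size are hypothetical; this is not taken from either site's code, and the linked sites may order or generate their entries differently.

```python
import uuid

PAGE_SIZE = 50      # hypothetical number of entries per page
TOTAL = 2 ** 128    # number of distinct 128-bit values

def uuids_for_page(page: int, page_size: int = PAGE_SIZE) -> list[str]:
    """Return the UUID strings listed on a zero-based page number."""
    start = page * page_size
    if start < 0 or start >= TOTAL:
        raise ValueError("page out of range")
    end = min(start + page_size, TOTAL)
    # Nothing is stored: each entry is computed from its index on demand.
    return [str(uuid.UUID(int=i)) for i in range(start, end)]

def last_page(page_size: int = PAGE_SIZE) -> int:
    """Index of the final page, which may be shorter than the rest."""
    return (TOTAL - 1) // page_size
```

Under these assumptions the list does have a fixed last entry, but a crawler following "next page" links from page 0 would never get there in practice.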
diggan 1 year ago
Aren't those finite lists? How is a scraper (normal or LLM) supposed to "get stuck" on those?
- gruez 1 year ago
  
  Even though 2^128 UUIDs is technically "finite", for all intents and purposes it is infinite to a scraper.
  
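To put a rough number on "infinite for all intents and purposes", here is a back-of-the-envelope sketch. The crawl rate and the number of UUIDs per page are made-up assumptions, chosen to be generous to the scraper.

```python
TOTAL_UUIDS = 2 ** 128
SECONDS_PER_YEAR = 365 * 24 * 60 * 60  # ignores leap years

def years_to_crawl(pages_per_second: float, uuids_per_page: int) -> float:
    """Years needed to fetch every UUID at a constant crawl rate."""
    uuids_per_second = pages_per_second * uuids_per_page
    return TOTAL_UUIDS / uuids_per_second / SECONDS_PER_YEAR

# Hypothetical scraper: one million pages per second, 1000 UUIDs per page.
print(f"{years_to_crawl(1e6, 1000):.2e} years")
```

Even at that rate the result is on the order of 10^22 years, far longer than the age of the universe (roughly 1.4 × 10^10 years).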

hartator 1 year ago

Every not-found page that doesn't return a 404 HTTP status code is basically an infinite trap.

It’s useless to do this, though, as all crawlers have a way to handle it. It’s very much crawler 101.
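One common way to handle this kind of "soft 404" can be sketched as follows: fetch a random URL that almost certainly does not exist on the host, then compare later responses against it. The function name `is_soft_404` and the 0.9 similarity threshold are hypothetical, and this is a heuristic illustration rather than what any particular crawler does. Fetching is left out so the logic stays self-contained.

```python
from difflib import SequenceMatcher

def is_soft_404(status: int, body: str,
                probe_status: int, probe_body: str,
                threshold: float = 0.9) -> bool:
    """Guess whether a response is a not-found page served without a 404.

    probe_status and probe_body are the response to a random,
    almost certainly nonexistent URL on the same host.
    """
    if status == 404:
        return False  # a real 404, nothing "soft" about it
    if probe_status == 404:
        return False  # the host reports missing pages properly
    # The host answers garbage URLs without a 404, so treat any page
    # that closely resembles that answer as a disguised not-found page.
    similarity = SequenceMatcher(None, body, probe_body, autojunk=False).ratio()
    return similarity >= threshold
```

A crawler using something like this would stop following links from pages flagged as soft 404s, which is why such a trap on its own tends not to hold a scraper for long.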