Comment by nedrocks
19 days ago
Years ago I was building a search engine from scratch (back when that was a viable business plan). I was responsible for the crawler.
I built it as a distributed set of 10 machines, each capable of ~1k requests per second. I generally distributed domains as disparately as possible across machines to even out the load.
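The scheme described above can be sketched roughly as follows. This is a hypothetical reconstruction, not the original crawler's code: the names (`machine_for_domain`, `DomainRateLimiter`) and the one-request-per-second interval are assumptions. Hashing each domain to a single machine keeps all of a site's URLs on one box, which makes polite per-domain rate limiting a purely local decision.

```python
import hashlib
import time

NUM_MACHINES = 10  # cluster size mentioned in the comment


def machine_for_domain(domain: str) -> int:
    """Hash a domain to one crawler machine so every URL from that
    site is fetched by the same box (stable sharding)."""
    digest = hashlib.sha256(domain.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % NUM_MACHINES


class DomainRateLimiter:
    """Minimal per-domain limiter: at most one request per `interval` seconds."""

    def __init__(self, interval: float = 1.0):
        self.interval = interval
        self.next_allowed: dict[str, float] = {}  # domain -> earliest next-fetch time

    def acquire(self, domain: str) -> float:
        """Return how many seconds the caller should sleep before
        fetching `domain`, and reserve that slot."""
        now = time.monotonic()
        earliest = self.next_allowed.get(domain, now)
        delay = max(0.0, earliest - now)
        self.next_allowed[domain] = max(now, earliest) + self.interval
        return delay
```

A worker would route each discovered URL through `machine_for_domain` and, on the owning machine, sleep for `acquire(domain)` seconds before issuing the fetch. The hash-based sharding is the simplest choice; a real system would also need to handle machine failures and rebalancing.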
Inevitably I'd end up crashing someone's site even though we respected robots.txt, rate-limited requests, etc. I still remember the angry mail we'd get and how hard we tried to honor it.
18 years later and so much has changed.