Comment by ok123456

3 days ago

Why is kernel.org doing this for essentially static content? Cache-Control headers and ETags should solve this. Also, the Linux kernel has solved the C10K problem.

Because it's static content that is almost never in cache, precisely because it's infrequently accessed. Thus, almost every hit goes to the origin.

  • The content in question is statically generated 1–3 KB HTML files. Serving a single image costs as much as cold-serving hundreds of these requests.

    Putting up a scraper shield seems like it's more of a political statement than a solution to a real technical problem. It's also antithetical to open collaboration and an open internet of which Linux is a product.

Bots don't respect that.

  • Use a CDN.

    • A great option for most people, and indeed Anubis' README recommends using Cloudflare if possible. However, not everyone can use a paid CDN. Some people can't pay because their payment methods aren't accepted. Some people need to serve content, or serve countries, that a major CDN can't for legal and compliance reasons. Some organizations need their own independent infrastructure to serve their organizational mission.