Comment by ptorrone

16 hours ago

(posting this in both comments about this) i am the author of the article.

the adafruit blog is not trying to block you my dude(s). we are under constant automated scraping and ddos, largely from ai crawlers, and we use cloudflare to keep the site online at all. the nature of these things will cause false positives depending on browser, extensions, network, or referrer.

the site publishes full-text rss feeds with no blockers here, no ads: https://blog.adafruit.com/rss

the site respects do not track, privacy badger, and similar tools. the site will probably never pass the purity tests for everyone, the goal is to stay independent, publishing, without selling readers or folding into a mega-platform. we're open source and vc free, chill out about us, ok?

if you still can’t get an article and want it in html, markdown, text, or pdf, email me and i’ll send it directly, i will read it on the phone to you, i am not kidding.

we’re trying, and we’ll keep trying. you gotta meet somewhere.

Cloudflare is ridiculous. I can't even open it using Cromite (privacy enhanced, but not over the top, android browser).

I get:

blog.adafruit.com Verifying you are human. This may take a few seconds.

blog.adafruit.com needs to review the security of your connection before proceeding.

And this hangs forever. What difference does it make if I access this site using a browser (blocked anyway) or I ask my LLM to fetch the content? I bet my LLM could get it anyway, as I'm using basic local scraping with firecrawl for backup. So if my LLM fails to retrieve it using my basic local crawl4ai, it will use my paid firecrawl api, and those guys can scrape EVERYTHING.

I do not understand why you (as a site owner) care. Are these bots generating so much traffic? Can you set it up to serve a text-only version to them then?

  • this isn’t an adafruit-specific stance, it’s a web-wide problem. automated scraping and bot traffic are enough to take independent sites offline, and cloudflare is a tool we use to keep the site available at all. we publish full-text rss with no blockers here: https://blog.adafruit.com/rss . if cloudflare trips on your browser and you want an article, email me and i’ll send it in whatever format you want. we're always working to make it easier, it's hard, would rather have help than snarks and dunks.

  • My office uses Zscaler and lots of sites automatically block that because the company running the product makes the product seem like an "open anonymous proxy".

Forget it. Every thread has these people who complain when some website doesn't open in Lynx or Amaya with JavaScript disabled. Ignore them.

When companies that earn their money by selling things deliberately make their website hard to access (especially for scrapers -- of any sort), then they're making a choice to abandon their customers.

It seems ruthlessly disappointing to consider, but maybe Adafruit isn't cut out for this whole Internet thing.

  • Can you elaborate on the logic that makes preventing scrapers (note, you didn't mention actually hindering accessibility technologies) customer antagonistic?

    • When a product doesn't show up at all using the [potential] customer's chosen tools (whether a search engine like Google, or an LLM like ChatGPT), then that product is invisible.

      An invisible product is one that may as well not exist. When a person can't find it, then they also can't purchase it.

  • (posting this in both comments about this) i am the author of the article. the adafruit blog is not trying to block you my dude(s). we are under constant automated scraping and ddos, largely from ai crawlers, and we use cloudflare to keep the site online at all. the nature of these things will cause false positives depending on browser, extensions, network, or referrer.

    the site publishes full-text rss feeds with no blockers here, no ads: https://blog.adafruit.com/rss

    the site respects do not track, privacy badger, and similar tools. the site will probably never pass the purity tests for everyone, the goal is to stay independent, publishing, without selling readers or folding into a mega-platform. we're open source and vc free, chill out about us, ok?

    if you still can’t get an article and want it in html, markdown, text, or pdf, email me and i’ll send it directly, i will read it on the phone to you, i am not kidding.

    we’re trying, and we’ll keep trying. you gotta meet somewhere.