Comment by ArinaS

2 months ago

This thing, despite using "captcha" in its name, is not your typical captcha like hCaptcha or Google's reCAPTCHA, because it uses a proof-of-work mechanism instead of verification that requires user input, such as typing answers into text boxes or clicking on images.
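
For anyone unfamiliar with the idea, here is a minimal hashcash-style sketch of what a proof-of-work challenge can look like. This is an illustration only, not the actual scheme used by this project, go-away, or Anubis: the function names, the `challenge:nonce` string format, and the "leading zero bits of SHA-256" rule are all assumptions made for the example.

```python
import hashlib
from itertools import count


def leading_zero_bits(digest: bytes) -> int:
    """Count how many zero bits the digest starts with."""
    bits = 0
    for byte in digest:
        if byte == 0:
            bits += 8
            continue
        # A non-zero byte has (8 - bit_length) leading zero bits.
        bits += 8 - byte.bit_length()
        break
    return bits


def verify(challenge: str, nonce: int, difficulty: int) -> bool:
    """Server side: a single hash is enough to check a submitted nonce."""
    digest = hashlib.sha256(f"{challenge}:{nonce}".encode()).digest()
    return leading_zero_bits(digest) >= difficulty


def solve(challenge: str, difficulty: int) -> int:
    """Client side: brute-force nonces until one meets the difficulty.

    On average this takes about 2**difficulty hash attempts.
    """
    for nonce in count():
        if verify(challenge, nonce, difficulty):
            return nonce


if __name__ == "__main__":
    # Hypothetical challenge string; a real server would issue a random one.
    nonce = solve("example-challenge", difficulty=12)
    print(nonce, verify("example-challenge", nonce, 12))
```

The point of the design is the asymmetry: the visitor's machine does the expensive search without any user interaction, while the server checks the answer with one hash.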

AI bots can't solve proof-of-work challenges because the browsers they use for scraping don't support the features needed to solve them. This is highlighted by the existence of other proof-of-work solutions designed specifically to filter out AI bots, like go-away[1] or Anubis[2].

And yes, they work - once GNOME deployed one of these proof-of-work challenges on their GitLab instance, only 3% of requests passed it, suggesting that about 97% of the traffic came from bots[3].

[1] - https://git.gammaspectra.live/git/go-away

[2] - https://github.com/TecharoHQ/anubis

[3] - https://thelibre.news/foss-infrastructure-is-under-attack-by...: "According to Bart Piotrowski, in around two hours and a half they received 81k total requests, and out of those only 3% passed Anubi's proof of work, hinting at 97% of the traffic being bots."

4 comments

diggan 2 months ago

> AI bots can't solve proof-of-work challenges because the browsers they use for scraping don't support the features needed to solve them. This is highlighted by the existence of other proof-of-work solutions designed specifically to filter out AI bots, like go-away[1] or Anubis[2].

Huh, they definitely can?

go-away and Anubis reduce the load on your servers, as bot operators cannot just scrape N pages per second without any drawbacks. Instead, it gets really expensive to make thousands of requests, as they're all really slow.

But for a user who runs their own AI agent that browses the web, things like Anubis and go-away aren't meant to stop them from accessing websites at all (nor do they); access will just be a tiny bit slower.

Those tools are meant to stop site-wide scraping, not individual automatic user-agents.

graemep 2 months ago

> AI bots can't solve proof-of-work challenges because browsers they use for scraping don't support features needed to solve them.

At least sometimes. I do not know about AI scraping but there are plenty of scraping solutions that do run JS.

It also puts off some genuine users like me who prefer to keep JS off.

The 97% is only accurate if you assume a zero false positive rate.
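
To make that concrete, here is a rough back-of-the-envelope sketch using the figures quoted upthread (81k requests, 3% passed). The model is an assumption, not anything measured: it supposes no bot ever passes the challenge and that some hypothetical fraction `false_positive_rate` of genuine requests fails it (JS disabled, old browser, and so on).

```python
def estimated_bot_share(total_requests: int,
                        passed_fraction: float,
                        false_positive_rate: float) -> float:
    """Estimate the share of requests that came from bots.

    Assumptions (hypothetical, for illustration only):
      * no bot request passes the challenge;
      * false_positive_rate is the fraction of genuine requests that fail it.
    """
    if not 0.0 <= false_positive_rate < 1.0:
        raise ValueError("false_positive_rate must be in [0, 1)")
    passed = total_requests * passed_fraction
    # Genuine requests = those that passed + those wrongly rejected.
    genuine = passed / (1.0 - false_positive_rate)
    return 1.0 - genuine / total_requests


if __name__ == "__main__":
    for fp in (0.0, 0.1, 0.5):
        share = estimated_bot_share(81_000, 0.03, fp)
        print(f"false positive rate {fp:.0%}: bot share ~{share:.1%}")
```

With a zero false positive rate this gives back the 97% figure; if half of genuine requests were being rejected, the bot share would be about 94% instead. The actual false positive rate is not known from the quoted numbers.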

ArinaS 2 months ago

> "It also puts off some genuine users like me who prefer to keep JS off."

Non-JavaScript challenges are also available[1].

> "The 97% is only accurate if you assume a zero false positive rate."

GNOME's GitLab instance is not something people visit daily like Wikipedia, so the number of false positives is negligible.

[1] - https://git.gammaspectra.live/git/go-away/wiki/Challenges#no...
- graemep 2 months ago
  
  > Non-javascript challenges are also available
  Did not know that. Good news.
  > GNOME's gitlab instance is not something people visit daily like Wikipedia, so it's a negligible amount of false positives.
  As an absolute number, yes, but as a proportion?