Comment by MikeDVB

6 months ago

We have been seeing our clients' sites being absolutely *hammered* by AI bots trying to blend in. Some of the bots use invalid user agents - they _look_ valid on the surface, but under the slightest scrutiny, it becomes obvious they're not real browsers.

Personally I have no issues with AI bots, that properly identify themselves, from scraping content as if the site operator doesn't want it to happen they can easily block the offending bot(s).

We built our own proof-of-work challenge that we enable on client sites/accounts as they come under 'attack' and it has been incredible how effective it is. That said I do think it is only a matter of time before the tactics change and these "malicious" AI bots are adapted to look more human / like real browsers.

I mean honestly it wouldn't be _that_ hard to enable them to run javascript or to emulate a real/accurate User-Agent. That said they could even run headless versions of the browser engines...

It's definitely going to be cat-and-mouse.

The most brutal honest truth is that if they throttled themselves as not to totally crash whatever site they're trying to scrape we'd probably have never noticed or gone through the trouble of writing our own proof-of-work challenge.

Unfortunately those writing/maintaining these AI bots that hammer sites to death probably either have no concept of the damage it can do or they don't care.

2 comments

MikeDVB

jchw 6 months ago

> We have been seeing our clients' sites being absolutely hammered by AI bots trying to blend in. Some of the bots use invalid user agents - they _look_ valid on the surface, but under the slightest scrutiny, it becomes obvious they're not real browsers.

Yep. I noticed this too.

> That said they could even run headless versions of the browser engines...

Yes, exactly. To my knowledge that's what's going on with the latest wave that is passing Anubis.

That said, it looks like the solution to that particular wave is going to be to just block Huawei cloud IP ranges for now. I guess a lot of these requests are coming from that direction.

Personally though I think there are still a lot of directions Anubis can go in that might tilt this cat and mouse game a bit more. I have some optimism.

MikeDVB 6 months ago

I haven't seen much if anything getting past our pretty simple proof-of-work challenge but I imagine it's only a matter of time.
Thankfully, so far, it's still been pretty easy to block them by their user agents as well.