Comment by Leynos
11 days ago
Is this companies collecting data for model training, or is it agentic tools operating on behalf of users?
I think, in the grand scheme of things, barely anyone uses agents (as of now) to crawl sites at any real volume, apart from maybe the odd quick web search. At least, that's been my observation of my friends in non-technical fields using LLMs.
From what the web logs show, it's in fact the same few AI companies' crawlers constantly crawling and recrawling the same URLs over and over, presumably to gain even the slightest advantage over each other, since they are definitely in a zero-sum mindset at the moment.
Whatever it is, I've seen this kind of commons abuse hit GitLab servers so hard that it pegged 64 high-wattage server cores 24/7. Installing mitigations cut the power bill in half.
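
For anyone curious what "mitigations" can look like in practice, here's a minimal sketch, not the commenter's actual setup (they don't say what they installed): a tiny WSGI middleware that refuses known AI-crawler user agents and rate limits everything else per client IP. The user-agent strings, rate, and burst values are illustrative assumptions.

```python
# Minimal sketch of two common crawler mitigations (assumed, not the
# commenter's actual configuration): a user-agent denylist plus a
# per-client token-bucket rate limit, as WSGI middleware.
import time
from collections import defaultdict

BLOCKED_AGENTS = ("GPTBot", "ClaudeBot", "CCBot", "Bytespider")  # illustrative list
RATE = 1.0    # tokens refilled per second per client (assumed)
BURST = 10.0  # maximum bucket size, i.e. allowed burst (assumed)


class CrawlerMitigation:
    """Return 403 for denylisted crawlers, 429 for clients over budget."""

    def __init__(self, app):
        self.app = app
        # Each client starts with a full bucket at the time of first request.
        self.buckets = defaultdict(lambda: (BURST, time.monotonic()))

    def _allow(self, client):
        tokens, last = self.buckets[client]
        now = time.monotonic()
        # Refill tokens for the time elapsed since the last request.
        tokens = min(BURST, tokens + (now - last) * RATE)
        if tokens < 1.0:
            self.buckets[client] = (tokens, now)
            return False
        self.buckets[client] = (tokens - 1.0, now)
        return True

    def __call__(self, environ, start_response):
        agent = environ.get("HTTP_USER_AGENT", "")
        client = environ.get("REMOTE_ADDR", "unknown")
        if any(bot in agent for bot in BLOCKED_AGENTS):
            start_response("403 Forbidden", [("Content-Type", "text/plain")])
            return [b"Crawler blocked\n"]
        if not self._allow(client):
            start_response("429 Too Many Requests", [("Retry-After", "5")])
            return [b"Rate limit exceeded\n"]
        return self.app(environ, start_response)
```

You'd wrap an existing WSGI app with `app = CrawlerMitigation(app)`; in a real deployment this usually lives in the reverse proxy instead, but the idea is the same.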