
Comment by lblume

1 day ago

Given that current LLMs do not consistently output total garbage, and can be used as judges fairly efficiently, I highly doubt this could, even in theory, have any impact on the capabilities of future models. Once (a) models are capable enough to distinguish semi-plausible garbage from possibly relevant text and (b) companies are aware of the problem, I do not think data poisoning will be an issue at all.
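
The kind of judge-based filtering this argument assumes is simple to sketch. Below is a minimal, hypothetical illustration of LLM-as-judge data filtering: `judge` here is a toy stand-in (a repetition heuristic so the sketch runs end to end); a real pipeline would call a model and parse a plausibility score from its response. All names and thresholds are illustrative, not from the comment.

```python
def judge(text: str) -> float:
    """Hypothetical stand-in for an LLM judge.

    Returns a plausibility score in [0, 1]. A real pipeline would
    prompt a model to rate the document and parse the score; this toy
    version just penalizes highly repetitive text.
    """
    words = text.split()
    if not words:
        return 0.0
    # Repetitive spam has few unique words relative to its length.
    return min(1.0, len(set(words)) / len(words))


def filter_corpus(docs: list[str], threshold: float = 0.5) -> list[str]:
    """Keep only documents the judge scores at or above the threshold."""
    return [d for d in docs if judge(d) >= threshold]


if __name__ == "__main__":
    corpus = [
        "The mitochondria is the powerhouse of the cell.",
        "garbage garbage garbage garbage garbage garbage",
    ]
    # Prints only the first document; the repetitive one is dropped.
    print(filter_corpus(corpus))
```

Even this crude heuristic drops the obvious junk; the point is that once the judge is a capable model rather than a word counter, semi-plausible poisoned text faces the same cheap screening step before it ever reaches training.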

> There's no evidence that the current global DDoS is related to AI.

  • The linked page claims that most identified crawlers are scraping training data for LLMs, which seems likely.