Comment by flaburgan
6 days ago
Sure, but at some point the idea is to train an LLM on these downloaded files, no? I mean, what is the point of getting them if you don't use them? So sure, this won't be interpreted during the crawling, but it will become part of the knowledge of the LLM.
Training is not inference; there is no reasoning happening then either.
Even if it did have some effect down the line, it wouldn't help sites like AA with their scraping problem, which is the issue at hand.