Comment by phillipcarter

8 months ago

> This is certainly true for new AI companies and new content, but the vast majority of scraping useful content has already been completed.

For training a base model, yes, but there's another big category of AI use case: search. Those model invocations involve web searches, often during reasoning steps, and they will absolutely scrape for content.

nialse 8 months ago

Agreed. The question is whether new content is valuable enough. Or will we see other sources rise to the occasion? Meta, Google, X and ByteDance at least have other sources of current content, which they may start to promote "for visibility". Whether these sources will be sufficient for the reasoning steps is uncertain, though.