Comment by senko

7 months ago

> it costs real money to serve the site and crawlers are often considered parasitic.

> Another example would be where a commerce site doesn’t want competitors bulk-scraping their catalog

I think of crawlers that bulk download/scrape (e.g. for training) as distinct from an agent that interacts with a website on behalf of one user.

For example, if I ask an AI to book a hotel reservation, that's, in my mind, different from a bot that scrapes all available accommodation.

For the latter, ideally a common corpus would be created and maintained: AI providers (or upstart search engines) would pay to access this data, and the funds would be distributed to the sites crawled.

(never gonna happen but one can dream...)

fragmede 7 months ago

But which hotel reservation? I want my agent to look at all available options and help me pick the best one: location vs price vs quality. How does it do that other than by scanning all available options? (Realistically Expedia has that market on lock, but the hypothetical still remains.)