zb3 13 days ago
for us, advertisers and our AI models
ern_ave 13 days ago
My guess is that AI training is the main issue. Data that you can prove was generated by humans is now exceedingly valuable, and most of it comes from the days before LLMs. The situation is a bit like how steel manufactured before the nuclear age is valuable.
adamnemecek 13 days ago
But why would people train on excerpts from Google Books when whole books can be downloaded from libgen and such?
londons_explore 13 days ago
Google Books is much bigger than libgen.
asdefghyk 13 days ago
Copyright reasons?