Comment by voidUpdate
6 hours ago
My preference is that if you need terabytes of data to train an LLM, that data should be used in accordance with its copyright and with the consent of the copyright holder, not just hoovered up from wherever you can find a few more bytes.