Comment by kator
5 hours ago
> Yet, this shift made me re-evaluate the open source code publishing. Prior to that, I have been positive about free and open software, and considered this to be the default mode for work such as kefir. I did not require any justifications from myself to publish something. Now, however, I feel more and more that the main beneficiaries of my unpaid work are companies scraping the internet to train large language models. Currently accepted status quo in this area goes against my own intentions in licensing this work under GNU GPLv3. Publication has ceased to be the "null hypothesis" for me, and requires explicit mental justification which I am not able to provide.
I feel this pain. One of my small donation-driven sites has been destroyed by crawlers that just ignore robots.txt and burn the site into the ground.
Sort of jokingly I proposed an update to the "spam fax" law:
This is essentially the digital world transforming from a high trust society into a low trust one. Sad to see.
Not even just digital; much of the world is shifting from high trust to low trust as well: https://social.desa.un.org/sites/default/files/inline-files/...
To whom would you attribute the greater part of that reduction in trust: the people using FOSS to train LLMs, or the people trying to block them?
People who break the social contract are the ones responsible for breaking it, not the ones who take steps in response to the social contract being broken.
It's definitely the ones DDoSing websites while giving no attribution in any way to the original creators.
Really hate to say it, but I’ve stopped publishing my work too for this reason. I spend most of my time now building my own little software ark, and I aspire to no longer think of programming in the next few years. I feel like the creative economy in general will be unrecognizable in the near future, maybe nonexistent. I wonder what modes of collaboration on ideas might form in the next few years.
Here is what the purveyors of AI don't seem to realise. You can bend copyright law all you want in order to train your models on whatever you can grab, but in the absence of genuine protection of their creative work, authors are simply not going to be publishing at all.
I think they see it all too well. They still think they can make bank today while it lasts; whatever comes after is some other shareholder's problem. And if we're talking about open source, killing it might be a positive side effect for them: they'll be ready to sell you a closed source alternative when you no longer have options.
People who are making stuff because they want to share it are still going to be publishing. And fighting to be noticed in an unending torrent of slop.
Great. More work for AI then.
The sad thing is I feel trapped on all sides of the debate. I wrote a book about LLMs and human creativity (spoiler: humans win for a long time). I was going to do it as a blog series, but instead I published https://www.amazon.com/dp/B0GXCSY4W8 because I felt I might at least get a bit back for the literally hundreds of hours of my life I poured into the book, and for my editor and the friends who read it and provided reviews.
And I push a lot of open source code, including a ton for the SWGEmu project, but now I’m of mixed mind about whether to stop pushing anything public. I can’t decide. Am I talking out of both sides of my mouth? It’s a confusing time to navigate for sure.
> The sender pays, not the receiver.
You have a hole here. Your web server is sending the response and the bot is receiving.
Fix that and … profit? :-)
Oh, good point, I got that backwards… OMG, my fax brain didn’t even think about it.
I'm trying to compose a better wording, but my attempts aren't working. The best I've got is:
> The initiator of the communication pays, not the server operator.