Comment by customguy
3 hours ago
> I'd argue back that LLMs likely have a better understanding of a11y conventions than I do as well.
No, other people did. They wrote about it, and LLM can sometimes use that. Once they no longer write about it, what then?
> More people building things is straightforwardly good, and if some of those things are slower or less accessible, that's a tradeoff people are entitled to make.
That I agree with. The more the merrier, all else being the same. And if "AI" trickled into everything because of the undeniable improvements it leads to, the situation and most of the sentiments would be very different, I think.
But even then, people aren't entitled to the knowledge "created" by doing the work. If attribution and compensation were tackled in earnest, if you could only train on the materials of the people you pay to produce those materials, it might be much quicker and cheaper to just learn CSS.
> No, other people did. They wrote about it, and LLM can sometimes use that. Once they no longer write about it, what then?
It can read the code? Historical discussions around it? Commit histories?
> But even then, people aren't entitled to the knowledge "created" by doing the work. If attribution and compensation were tackled in earnest, if you could only train on the materials of the people you pay to produce those materials, it might be much quicker and cheaper to just learn CSS.
OSS code and people’s public writings are available to anyone all the time. Common Crawl, the open source web crawl dump, has been around for over a decade. No one had any problem with these systems being developed on them, until they finally started to become useful, so what’s the sort of legal or ethical framework you’re pointing to?
> It can read the code? Historical discussions around it? Commit histories?
Assume everybody is now using LLM because they're better, and because the people who created artisanal things in their free time out of sheer generosity no longer have free time, or any food at all, or simply no longer feel generous. And the few people who are such specialists that they would be slowed down by them only do proprietary work, for lots of money.
What then? LLM learning from LLM doesn't really work, does it?
This is not intended as some kind of gotcha; to me this is a huge elephant on the couch.
> No one had any problem with these systems being developed on them, until they finally started to become useful, so what’s the sort of legal or ethical framework you’re pointing to?
That it's perfectly fine for people to say "I was fine with that, but I'm not fine with this". They can give you detailed explanations for their individual decisions, every single one of them, but there is no point in discussing them in aggregate because that aggregate is an abstraction. And those explanations are optional, too: nobody owes one, and people are free to change their minds for bad reasons or for no reason at all.
> Assume everybody is now using LLM because they're better, and because the people who created artisanal things in their free time out of sheer generosity no longer have free time, or any food at all, or simply no longer feel generous. And the few people who are such specialists that they would be slowed down by them only do proprietary work, for lots of money.
> What then? LLM learning from LLM doesn't really work, does it?
Oh, what? No, that’s exactly how it works, even today. RL with verification is done with synthetic data and rejection sampling. If something that needs to get done can’t be done purely by an agent, it’s done with human help. That will always be the case; it will just get rarer.
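To make "synthetic data and rejection sampling" concrete, here is a toy sketch, not how any particular lab does it. The `propose` function is a hypothetical stand-in for a model, `verify` is a hypothetical programmatic checker (in practice it might be a test suite), and the arithmetic task is made up for illustration. Only samples that pass the check are kept as training rows.

```python
import random

def propose(a, b, rng):
    """Hypothetical stand-in for a model: returns a + b, sometimes off by one."""
    return a + b + rng.choice([-1, 0, 0, 1])

def verify(a, b, answer):
    """Hypothetical verifier: a programmatic check of the candidate answer."""
    return answer == a + b

def rejection_sample(problems, samples_per_problem=8, seed=0):
    """Sample candidates per problem and keep only the ones that verify."""
    rng = random.Random(seed)
    kept = []
    for a, b in problems:
        for _ in range(samples_per_problem):
            answer = propose(a, b, rng)
            if verify(a, b, answer):
                # Accepted sample becomes a synthetic training row.
                kept.append({"prompt": f"{a} + {b}", "completion": str(answer)})
                break  # one verified sample per problem is enough here
            # Rejected samples are simply discarded.
    return kept

problems = [(1, 2), (10, 5), (7, 7)]
dataset = rejection_sample(problems)
```

Problems where no sample passes verification just produce no row, which is roughly where the "human help" part would come in.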
> That it's perfectly fine for people to say "I was fine with that, but I'm not fine with this".
Agree with you there, but there’s a theme or insinuation (not saying you’re saying this) that these companies “stole work” (and there were definitely a lot of copyright violations, sure). It’s just unclear to me what principles or legal frameworks these companies or institutions should have used to develop the technology. I don’t even know whether I mean to imply it’s not unethical; I’m more looking for a steelman of the argument that it is. But of course people are entitled to their value systems and judgements, and to point out real harm.
> It can read the code? Historical discussions around it? Commit histories?
And if everyone bunkers up and all that open content dries up starting in 2026, let's say, what happens?
It won't happen, for two reasons. One is that a great deal of open-source software and hobbyist knowledge sharing has never been driven by financial reward, and people will continue to do it anyway. Finer-grained opt-out controls, something like a search engine 'nofollow', would be great and will hopefully come with time.
The other is that many kinds of technology have faced this kind of tragedy-of-the-commons argument in the past, and it has never borne out. Printing presses copied manuscripts, search engines copied and indexed web pages, open-source software was incorporated into commercial products, Wikipedia repackaged knowledge produced elsewhere.
In almost all cases the total amount of creation increases because the technology lowered costs, expanded audiences, or created new forms of value. The speed of creation of new 'View Source' outpaces the number of people pulling back.
Well, that historical content and code still exist, right? Are you just saying “what if we end up in a world of walled gardens because OSS dies when people don’t want their work stolen”? In that case, these companies will still get data, and they don’t need OSS anymore. It’s already web-crawled, licensed, or commissioned; they pay people to generate novel traces when they need them, or at the very least sets of prompts and tests for verification. Then the synthetic data that passes verification gets added to the training set.
> Once they no longer write about it, what then?
The AI will no longer be able to reproduce new a11y conventions/guidelines, but if no one is writing about it, do any new a11y conventions/guidelines even exist at that point?
> They wrote about it, and LLM can sometimes use that. Once they no longer write about it, what then?
That’s a good question, but I suspect that when new technologies come out, the normally indecipherable specs released by industry groups (which is why we needed blogs) will be deciphered by LLMs. Not saying this is good or bad (it’s likely both), just saying it.