Comment by CamperBob2
1 day ago
The GPL arose from Stallman's frustration at not having access to the source code for a printer driver that was causing him grief.
In a world where he could have just said "Please create a PDP-whatever driver for an IBM-whatever printer," there never would have been a GPL. In that sense AI represents the fulfillment of his vision, not a refutation or violation.
I'd be surprised if he saw it that way, of course.
The safeguards will prevent the AI from reproducing the proprietary drivers for the IBM-whatever printer, and it will not provide code that breaks the DRM that exists to prevent third-party drivers from working with the printer. There will, however, be no such safeguards or filters to prevent IBM from writing a proprietary driver for their next printer, using existing GPL drivers as a building block.
Code will only ever go in one direction here.
Then we'd better stop fighting against AI, and start fighting against so-called "safeguards."
I wish you luck. The music industry basically won its fight to force safeguards against AI music. The film industry is getting laws that regulate AI film actors. Code-generating AIs train only on freely accessible code, not proprietary code. And multiple laws against AI porn are being made all over the world (or are possibly already on the books).
What we should fight is Rules For Thee but Not for Me.
"we better stop fighting against CCTVs everywhere and start fighting against them used for indiscriminate surveillance"
But that isn't the same code that you were running before. And like, let's not forget GPLv3: "please give me the code for a mobile OS that could run on an iPhone" does not in any way help me modify the code running on MY iPhone.
Sure it does. Just tell the model to change whatever you want changed. You won't need access to the high-level code, any more than you need access to the CPU's microcode now.
We're a few years away from that, but it will happen unless someone powerful blocks it.
I believe the point was that iPhones don't allow running custom code even if you have the code, whereas GPLv3 mandates that any conveyed form of a work must be replaceable by the user. So unless LLMs can easily spit out an infinite stream of 0days to exploit to circumvent that, they won't help here.
In said hypothetical world, though, the whatever-driver would also have been written by LLMs; and, if the printer or whatever is non-trivial and made by a typical large company, by many LLM instances with a sizable amount of token spending over a long period of time.
So getting your own LLM rewrite to an equivalent point (or, rather, less buggy as that's the whole point!) would be rather expensive; at the absolute very least, certainly more expensive than if you still had the original source code to reference or modify (even if an LLM is the thing doing those). Having the original source code is still just strictly unconditionally better.
Never mind the question of how you even get your LLM to reverse-engineer & interact with & observe the physical hardware of your printer, and whatever wasted ink during debugging of the reinvention of what the original driver already did correctly.
Now I'm kind of curious: if you give an LLM the disassembly of a proprietary firmware blob and tell it to turn it into human-readable source code, how good is it at that?
You could probably even train one to do that in particular. Take existing open source code and its assembly representations as training data and then treat it like a language translation task. Use the context to guess what the variable names were before the original compiler discarded them etc.
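A minimal sketch of how such a training set might be built, assuming gcc is on the PATH and you have a directory of standalone open-source .c files to pair up (the paths and JSONL layout here are just placeholders, not anything anyone has actually trained on):

```python
import json
import subprocess
from pathlib import Path

SRC_DIR = Path("corpus/c")        # hypothetical directory of open-source .c files
OUT_FILE = Path("pairs.jsonl")    # one {"asm": ..., "c": ...} record per file

with OUT_FILE.open("w") as out:
    for c_file in SRC_DIR.glob("*.c"):
        asm_file = c_file.with_suffix(".s")
        # Compile to assembly with optimizations, so the model sees realistic,
        # name-stripped compiler output rather than -O0 noise.
        result = subprocess.run(
            ["gcc", "-O2", "-S", str(c_file), "-o", str(asm_file)],
            capture_output=True,
        )
        if result.returncode != 0:
            continue  # skip sources that don't compile standalone
        record = {
            "asm": asm_file.read_text(),   # model input: compiler output
            "c": c_file.read_text(),       # training target: original source
        }
        out.write(json.dumps(record) + "\n")
```

In practice you'd presumably want to vary compilers, optimization levels, and target architectures so the model doesn't overfit to one compiler's idioms, and to split by function rather than by whole file.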
The most difficult parts of getting readable code would be dealing with inlined functions and otherwise-duplicated code from macros or similar, and dealing with in-memory structure layouts; both pretty complicated very-global tasks. (never mind naming things, but perhaps LLMs have a good shot at that)
That said, ChatGPT currently seems to fail even basic things; it completely missed the `thrM` path being possible here: https://chatgpt.com/share/69296a8e-d620-800b-8c25-15f4260c78... https://dzaima.github.io/paste/#0jZJNTsMwEIX3OcWoSFWCqrhN0wb... And that's only basic bog-standard branching, no in-memory structures or stack usage. (Such trivial cases could be handled by running an actual proper disassembler before throwing an LLM at that wall, but of course that only solves the easy part.)
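For the "use a proper disassembler first" step, something like Capstone can turn the raw bytes into a clean listing before any LLM gets involved. A rough sketch, where the filename, architecture, and load address are just assumptions for the sake of the example:

```python
from capstone import Cs, CS_ARCH_ARM, CS_MODE_THUMB

# Hypothetical firmware blob; in reality you'd carve out the code section
# and take its load address from the image header or datasheet.
code = open("firmware.bin", "rb").read()
BASE_ADDR = 0x08000000

md = Cs(CS_ARCH_ARM, CS_MODE_THUMB)
for insn in md.disasm(code, BASE_ADDR):
    # Feed this listing to the model instead of raw hex, so it only has to
    # reason about control flow and data, not instruction decoding.
    print(f"0x{insn.address:08x}:\t{insn.mnemonic}\t{insn.op_str}")
```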
Should be possible. A couple of years ago I used an earlier ChatGPT model to understand and debug some ARM assembly, which I'm not personally very familiar with.
I can imagine that a process like what you describe, where a model is trained specifically on .asm / .c file pairs, would be pretty effective.
The only legal way to do that in the proprietary software world is a clean room implementation.
An AI could never do a clean room implementation of anything, since it was not trained on clean room materials alone. And it never can be, for obvious reasons. I don't think there's an easy way out here.