Comment by khafra
5 days ago
Rice's Theorem says you cannot, in general, predict or control the effects of nearly any program on your computer. For example, there's no way to guarantee that running a web browser on arbitrary input will not empty your bank account and donate it all to al-Qaeda; yet you're running a web browser on potentially attacker-supplied input right now.
I do agree that there's a quantitative difference in predictability between a web browser and a trillion-parameter mass of matrices and nonlinear activations which is already smarter than most humans in most ways, and whose real wants we have no idea how to elicit.
But that's more of an "unsafe at any speed" problem; it's silly to blame the person running the program. When the damage is caused by a toddler pulling a hydrogen bomb off the grocery store shelf, the solution is to get hydrogen bombs out of grocery stores (or, if you're worried about staying competitive with Chinese grocery stores, at least make our own carry adequate insurance for the catastrophes, or something).
In practice, most programs can be predicted within reasonable bounds quite easily. And you can contain the external effects of most programs quite easily. Rice's theorem doesn't stop you from keeping a program off the Internet, or running it in a VM.
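The containment point can be made concrete with a minimal sketch (function name and time budget are invented for illustration, and it assumes a `python3` binary on PATH): even when we cannot decide what an arbitrary program will do, we can bound what it is allowed to do, here by capping wall-clock time and capturing its output. Real sandboxes go much further, with VMs, namespaces, seccomp filters, and no network access.

```python
import subprocess

def run_contained(code: str, timeout_s: float = 2.0) -> str:
    """Run untrusted Python source with a hard wall-clock budget.

    Rice's theorem means we can't decide, in general, whether `code`
    halts or misbehaves; it doesn't stop us from bounding its effects.
    """
    try:
        result = subprocess.run(
            ["python3", "-c", code],
            capture_output=True, text=True, timeout=timeout_s,
        )
        return result.stdout
    except subprocess.TimeoutExpired:
        # Nonterminating or slow programs are killed, not trusted.
        return "<killed: exceeded time budget>"

print(run_contained("print('hello')"))    # terminates normally
print(run_contained("while True: pass"))  # contained despite nontermination
```

The same over-approximating stance works for network access: deny by default, and the undecidability of the program's behavior becomes irrelevant to the bound.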
Your later comparisons are nonsense. We're not talking about babies; we're talking about adults who should know better, assembling high-leverage tools specifically to interact with other people's lives. If they were even running with oversight, that would be something, but the operators are just letting them do whatever. But your implication that agents are "unsafe at any speed" leads to the same conclusion: do not run the program.
I guess today's kids don't know this, but "Unsafe at Any Speed" was the title of Ralph Nader's 1965 book that helped spur the creation of the Department of Transportation and the National Highway Traffic Safety Administration, and changed the automotive industry.
The point is that, if you're designing and selling a product which a large minority of people are going to use in a way that harms themselves and others, pointing at the users and calling them irresponsible doesn't actually help anybody. The people designing and selling the products actually need to make them safer. And if they're not going to do that voluntarily (they're not), we need the government to create insurance requirements, safety bonds, and whatever other incentive gradients are required to make the producers build safe products.
I caught the reference. To the extent it applies at all, I obviously think it reinforces my point. But badly engineered cars, members of an existing category that we know can be done tolerably well, are a very strained analogy to brand-new software deployed by people who understand how it works, and therefore the risks they are taking.
And actually, the deployer has a lot more control over the havoc the software can cause than the creator. They choose what credentials to give it, whether and how closely to monitor it, any other guardrails, etc. If the operator of the bot discussed in OP had intervened soon after it went off the rails, we wouldn't be here.
So sure, I would also tell the makers of this software to knock it off. Don't put out products that are the network equivalent of a chainsaw on a Roomba, no matter how many cool TikToks it creates. But when I'm talking to people running claws or whatever, they no longer have the excuse of ignorance. So the advice is still: do not run the program.
Blaming the person running the program is the right thing to do and it's the only thing to do.
This is a really strained equivalence. I can't know for certain that the sun won't fall out of the sky if I drink a second cup of coffee. The "laws of physics" are just descriptions based on observations, after all. But it's so hilariously unlikely that we can call it impossible.
Similarly, we can have some nuance here. Someone running a program with the intention of it generating posts on the internet is obviously responsible for what it generates.
Rice's Theorem does not say this. You can absolutely have 100%-confident knowledge of what a program will not do; the price is false positives. What the theorem rules out is a static analysis that is both sound and complete for a nontrivial semantic property. But you can have a sound analysis, or a complete one.
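The sound-but-incomplete trade-off is easy to see with a toy example (the mini-language, instruction names, and function here are all invented for illustration): a purely syntactic over-approximation can never miss a dangerous instruction, so a SAFE verdict is a genuine guarantee, but unreachable dead code triggers false positives.

```python
# Toy sound-but-incomplete analysis over a tiny straight-line language.
# "NET" is the dangerous instruction. If NET appears nowhere in the
# program text, no execution can reach it -- so "safe" verdicts are sound.
# Dead code containing NET is flagged anyway: a false positive.

def may_use_network(program: list[str]) -> bool:
    """Over-approximation: flag any syntactic occurrence of NET."""
    return any(instr.split()[0] == "NET" for instr in program)

safe_prog   = ["PUSH 1", "PRINT"]
dead_code   = ["JUMP 3", "NET example.com", "HALT"]  # NET is unreachable
live_danger = ["NET example.com"]

print(may_use_network(safe_prog))    # False: provably cannot use the network
print(may_use_network(dead_code))    # True: false positive, but still sound
print(may_use_network(live_danger))  # True: correctly flagged
```

Real static analyzers are vastly more precise, but the shape is the same: soundness is kept by erring toward "maybe dangerous", which is exactly the incompleteness Rice's Theorem forces.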