Comment by laffOr
13 hours ago
> What we have found: the model has strong internal constraints against harmful actions (consistently refuses things it flags as problematic), but the harder risk is subtler -- it can get into loops where it takes many small individually-reasonable actions that compound into something the operator did not intend.
Interestingly, this was anticipated by Asimov's laws of robotics decades ago. Quoting from Wikipedia:
> Furthermore, he points out that a clever criminal could divide a task among multiple robots so that no individual robot could recognize that its actions would lead to harming a human being
> Asimov, Isaac (1956–1957). The Naked Sun (ebook). p. 233. "... one robot poison an arrow without knowing it was using poison, and having a second robot hand the poisoned arrow to the boy ..."
https://en.wikipedia.org/wiki/Three_Laws_of_Robotics#cite_no...