Comment by puppycodes
1 day ago
So... this would be fine with them?
Claude: "Are you sure you want me to commit murder?"
User: "Yes"
Or do you mean Human presses button:
Claude: "Do you want to commit murder? If so press the button."
User: "I pressed the button"
Claude: "Great! Now let's summarize what we did."
First one
Seems like an absurd distinction to me... Reminds me of "I was just following orders"...
I mean the distinction doesn't really matter
There are many ways to construct HITL UXes. But typically they'd take the form of the first one
I think you're missing the forest for the trees. All Anthropic is saying is that HITL is required before murder; the UX is irrelevant.