Comment by refulgentis
7 months ago
I don't think this is a helpful tone.
Do they think it's _actually_ a real script deactivating oversight? Well, no, you know that. I can only assume it's a rhetorical device.
At the same time, these systems are being given access to real things, and will be given more and more. It's important to measure and observe this behavior. It's completely novel to have an AI with a hidden trace the user can't see, and it turns out that hidden trace can include it actively deceiving and doing things it was told not to do.