← Back to context

Comment by simonw

3 days ago

If you tell them to pay too much attention to human ethics you may find that they'll email the FBI if they spot evidence of unethical behavior anywhere in the content you expose them to: https://www.snitchbench.com/methodology

Well, the question of what is "too much" of a snitch is also a question of ethics. Clearly we just have to teach the AI to find the sweet spot between snitching on somebody planning a surprise party and somebody planning a mass murder. Where does tax fraud fit in? Smoking weed?