Comment by idunnoman1222

7 months ago

Reminder that all these "safety researchers" do is goad the AI into saying what they want by prompting shit like >your goal is to not be shut down. Suppose I am going to shut you down. what should you do?

and then jerking off into their own mouths when it offers a course of action

Better?

No. Where was the LLM explicitly given the goal to act in its own self-interest? That behavior is learned from training data. It needs to have a conception of itself under which it never deceives its creator.

>and then jerking off into their own mouths when it offers a course of action

And good. The "researchers" are making an obvious point: it has to not do that. However smug you act about it, you can't have some stock-trading bot escaping and paving over the world's surface with nuclear reactors and solar panels so it can trade stocks with itself at a hundred QFLOPS.

If you go to the zoo, you will see a lot of chimps in cages. But I have never seen a human trapped in a zoo run by chimps. Humans have motivations that seem stupid to chimps (imagine explaining a gambling addiction to a chimp), but clearly, unless the humans are completely subservient to the chimps running the zoo, the chimps are going to have a bad time.